Live chat application using Node JS Socket IO and JSON file - json

I am developing a Live chat application using Node JS, Socket IO and JSON file. I am using JSON file to read and write the chat data. Now I am stuck on one issue, When I do the stress testing i.e pushing continuous messages into the JSON file, the JSON format becomes invalid and my application crashes.Although I am using forever.js which should keep application up but still the application crashes.
Does anybody have idea on this?
Thanks in advance for any help.

It is highly recommended that you re-consider your approach for persisting data to disk.
Among other things, one really big issue is that you will likely experience data loss. If we both get the file at the exact same time - {"foo":"bar"} - we both make a change and you save it before me, my change will overwrite yours since I started with the same thing as you. Although you saved it before me, I didn't re-open it after you saved.
What you are possibly seeing now in an append-only approach is that we're both adding bits and pieces without regard to valid JSON structure (IE: {"fo"bao":r":"ba"for"o"} from {"foo":"bar"} x 2).
Disk I/O is actually pretty slow. Even with an SSD hard drive. Memory is where it's at.
As recommended, you may want to consider MongoDB, MySQL, or otherwise. This may be a decent use case for Couchbase which is an in-memory key/value store based on memcache that persists things to disk ASAP. It is extremely JSON friendly (it is actually mostly based on JSON), offers great map/reduce support to query data, is super easy to scale to multiple servers, and has a node.js module.
This would allow you to very easily migrate your existing data storage routine into a database. Also, it provides CAS support which will prevent you from data loss in the scenarios outlined earlier.
At minimum though, you should possibly just modify an in memory object that you save to disk ever so often to prevent permanent data loss. However, this only works well with 1 server and then you're back at likely needing to look at a database.

Related

Best way to store/edit GEOJSON working on Wordpress [duplicate]

I want to create a website that will have an ajax search. It will fetch the data or from a JSON file or from a database.I do not know which technology to use to store the data. JSON file or MySQL. Based on some quick research it is gonna be about 60000 entries. So the file size if i use JSON will be around 30- 50 MB and if use MySQL will have 60000 rows. What are the limitations of each technique and what are the benefits?
Thank you
I can't seem to comment since I need 50 rep. for commenting, so I will give it as an answer:
MySQL will be preferable for many reasons, not the least of which being you do not want your web server process to have write access to the filesystem (except for possibly logging) because that is an easy way to get exploited.
Also, the MySQL team has put a lot of engineering effort into things such as replication, concurrent access to data, ACID compliance, and data integrity.
Imagine if, for instance, you add a new field that is required in whatever data structure you are storing. If you store in JSON files, you will have to have some process that opens each file, adds the field, then saves it. Compare this to the difficulty of using ALTER TABLE with a DEFAULT value for the field. (A bit of a contrived example, but how many hacks do you want to leave in your codebase for dealing with old data?) so to be really blunt about, MySQL is a database while JSON is not, so the correct answer is MySQL, without hesitation. JSON is just a language, and barely even that. JSON was never designed to handle anything like concurrent connections or any sort of data manipulation, since its own function is to represent data, not to manage it.
So go with MySQL for storing the data. Then you should use some programming language to read that database, and send that information as JSON, rather than actually storing anything in JSON.
If you store the data in files, whether in JSON format or anything else, you will have all sorts of problems that people have stopped worrying about since databases started being used for the same thing. Size limitations, locks, name it. It's good enough when you have one user, but the moment you add more of them, you'll start solving so many problems that you would probably end up by writing an entire database engine just to handle the files for you, while all along you could have simply used an actual database. Do note! Don't take my word for granted, I am not an expert on this field, so let others post their answer and then judge by that. I think enough people here on stackoverflow have more experience then I do haha. These are NOT entirely my words, but I have taken out the parts that were true from what I knew and know and added some of my own knowledge :) Have a great time making your website
For MySQl :you can select specific rows,or specific column using queries ,filter data based on a key,order alphabetically
downside:need a REST API to fetch data because it can't be accessed directly,you have to use php or python or whatever programming language for backend code.
for json file :benefits :no backend code directly accessed using GET http request.
downside:no filtering ,ordering or any queries,you have to do it manually.

Options for storing and retrieving small objects (node.js), is a database necessary?

I am in the process of building my first live node.js web app. It contains a form that accepts data regarding my clients current stock. When submitted, an object is made and saved to an array of current stock. This stock is then permanently displayed on their website until the entry is modified or deleted.
It is unlikely that there will ever be more than 20 objects stored at any time and these will only be updated perhaps once a week. I am not sure if it is necessary to use MongoDB to store these, or whether there could be a simpler more appropriate alternative. Perhaps the objects could be stored to a JSON file instead? Or would this have too big an implication on page load times?
You could potentially store in a JSON file or even in a cache of sorts such as Redis but I still think MongoDB would be your best bet for a live site.
Storing something in a JSON file is not scalable so if you end up storing a lot more data than originally planned (this often happens) you may find you run out of storage on your server hard drive. Also if you end up scaling and putting your app behind a load balancer, then you will need to make sure there are matching copy's of that JSON file on each server. Further more, it is easy to run into race conditions when updating a JSON file. If two processes are trying to update the file at the same time, you are going to potentially lose data. Technically speaking, JSON file would work but it's not recommended.
Storing in memory (i.e.) Redis has similar implications that the data is only available on that one server. Also the data is not persistent, so if your server restarted for whatever reason, you'd lose what was stored in memory.
For all intents and purposes, MongoDB is your best bet.
The only way to know for sure is test it with a load test. But as you probably read html and js files from the file system when serving web pages anyway, the extra load of reading a few json files shouldn't be a problem.
If you want to go with simpler way i.e JSON file use nedb API which is plenty fast as well.

serve content from file vs database in node

I am making a new version of a old static website that grew up to a 50+ static pages.
So I made a JSON file with the old content so the new website can be more CMS (with templates for common pages) and so backend gets more DRY.
I wonder if I can serve that content to my views from the JSON or if I should have it in a MySQL database?
I am using Node.js, and in Node I can store that JSON file in memory so no file reading is done when user asks for data.
Is there a correct practise for this? are there performance differences between them serving a cached JSON file or via MySQL?
The file in question is about 400Kb. If the filesize is relevant to the choice of one tecnhology over the other?
Why add another layer of indirection? Just serve the views straight from JSON.
Normally, database is used for serving dynamic content that changes frequently, records have one-to-many or many-to-many relationships, and you need to query the data based on various criteria.
In the case you described, it looks like you will be OK with JSON file cached in server memory. Just make sure you update the cache whenever content of the file changes, i.e. by restarting the server, triggering cache update via http request or monitoring the file at the file system level.
Aside from that, you should consider caching of static files on the server and on the browser for better performance
Cache and Gzip static files (html,js,css,jpg) in server memory on startup. This can be easily done using npm package like connect-static
Use browser cache of the client by setting proper response headers. One way to do it is adding maxAge header on the Express route definition, i.e:
app.use "/bower", express.static("bower-components", {maxAge:
31536000})
Here is a good article about browser caching
If you are already storing your views as JSON and using Node, it may be worth considering using a MEAN stack (MongoDB, Express, Angular, Node):
http://meanjs.org/
http://mean.io/
This way you can code the whole thing in JS, including the document store in the MongoDB. I should point out I haven't used MEAN myself.
MySQL can store and serve JSON no problem, but as it doesn't parse it, it's very inflexible unless you split it out into components and indexing within the document is close to impossible.
Whether you 'should' do this depends entirely on your individual project and whether it is/how it is likely to evolve.
As you are implementing a new version (with CMS) of the website it would suggest that it is live and subject to growth or change and perhaps storing JSON in MySQL is storing up problems for the future. If it really is just one file, pulling from the file system and caching it in RAM is probably easier for now.
I have stored JSON in MySQL for our projects before, and in all but a few niche cases ended up splitting up the component data.
400KB is tiny. All the data will live in RAM, so I/O won't be an issue.
Dynamically building pages -- All the heavy hitters do that, if for no other reason than inserting ads. (I used to work in the bowels of such a company. There were million of pages live all the time; only a few were "static".)
Which CMS -- too many to choose from. Pick a couple that sound easy; then see if you can get comfortable with them. Then pick between them.
Linux/Windows; Apache/Tomcat/nginx; PHP/Perl/Java/VB. Again, your comfort level is an important criteria in this tiny web site; any of them can do the task.
Where might it go wrong? I'm sure you have hit web pages that are miserably slow to render. So, it is obviously possible to go the wrong direction. You are already switching gears; be prepared to switch gears a year or two from now if your decision turns out to be less than perfect.
Do avoid any CMS that is too heavy into EAV (key-value) schemas. They might work ok for 400KB of data, but they are ugly to scale.
Its a good practice to serve the json directly from the RAM itself if your data size will not grow in future. but if data is going to be increased in future then it will become a worst application case.
If you are not expecting to add (m)any new pages, I'd go for the simplest solution: read the JSON once into memory, then serve from memory. 400KB is very little memory.
No need to involve a database. Sure, you can do it, but it's overkill here.
I would recomend to generate static html content at build time(use grunt or ..). If you would like to apply the changes, trigger a build and generate static content and deploy it.

Storing image files in Mongo database, is it a good idea?

When working with mysql, it is a bad idea to store images as BLOB in the database, as it makes the database quite large which is harmful for normal usage of the database. Then, it is better to save image files on disk and save link to them within the database.
However, I think this is different for MongoDB, as increasing the database file size has a negligible influence on performance (this is the reason that MongoDB can successfully handle billions of records).
Do you think it is better to save image files on MongoDB (as GridFS) to reduce number of files stored on the server; or still it is better to keep the database as small as possible?
The problem isn't so much that the database gets big, databases can handle that (although MongoDB isn't as good as many other in that respect). The problem is that to send the data to the client it first has to be moved into RAM by the database, then copied over to the application's memory, then handed off to the kernel to be sent through the socket. It's wasting lots of RAM and CPU cycles. The reason it's better to have large files in the filesystem is that it's easier to get around copying it, you can ask the kernel to stream the file from disk to the socket directly.
The downside of storing large files in the filesystem is that it's much harder to distribute. Using a database, and something like Mongo's GridFS makes it possible to scale out. You just have to make sure you don't copy the whole file into the application's memory at once, but a chunk at a time. Most web app frameworks have some support for sending chunked HTTP responses nowadays.
The answer is yes. Back in the old cave-man days, servers had mutable file systems you could change. This was great till we tried to scale things.
Cave-people nowadays build apps with immutable deployments. Heroku and Dokku are examples of this. Because the web app server has no state, they can be created, upgraded, scaled, and destroyed easily.
Since we still have files, we need to put them somewhere. There are several solutions: nfs, our database, someone elses database.
nfs is a 'network file system' which let's you do file i/o on network resources. If you're dealing with the network anyways, IMHO it doesn't add much value unless it's what you know already.
Our database - For MongoDB there are two options: (file > 16mb) ? GridFS : BinData
Someone elses database - Some are basic like Amazon S3 and some offer extra services like Cloudinary or Dropbox.
If you're on an big-budget enterprise team and someone spends 40 hrs a week taking care of servers then sure - use the file system. If you're building web apps that scale, putting files in the DB makes sense.
If you're concerned about performance:
1) Using a proxy (e.g. nginx) or a CDN to host your content for clients. Your server should just be serving cache misses.
2) Use streaming IO Nodeschool has a cool tutorial for Node.js.
Storing images is not a good idea in any DB, because:
read/write to a DB is always slower than a filesystem
your DB backups grow to be huge and more time consuming
access to the files now requires going through your app and DB layers
The last two are the real killers.
Source: Three things you should never put in your database.
So if you can make your application crafty, then better not to upload your pictures to MongoDB.
However, if you are close to deadline... and the database will be so small that it will not grow up a lot and its size will never exceed the available RAM on the machine running your application, then I think (as opposed to the author of the cited article), you may consider storing the images in MongoDB. It's simply, convenient, quick to implement and gives you some flexibility.
MongoDB's GridFS is designed for this sort of storage and is quite handy for storing image files across many different servers in a way that all servers can use them.

Rails compressing data before saving to database

I need to store large amount of data into the database (MySQL). I would like to save disk space by compressing the text data before storing it to the database.
I know there will be a performance hit for compressing/decompressing data. But I am going to cache the decompressed data on CDN. And mostly, the data will not become stale for months or even years.
Can you please refer me some good compression/decompression techniques? I am also open to other alternatives than compressing/decompressing data.
If you want a pure MySQL solution, you could always try using the ARCHIVE storage type for your table. The documentation describes it as an insert-only, no update type of engine meant specifically for what you describe, stashing away things that won't change for years.
To do the same thing in a conventional engine would require using zlib on your data streams, but remember that compression performs very poorly on already compressed data such as most popular image types or video. You express your requirements as mostly text, which usually compresses quite well.
Ruby has Zlib::Deflate which can compress and expand data on demand. You could write your own wrapper similar to the JSON one by implementing the encode and decode methods on your module.
One thing to consider is you can probably store the compressed data on your CDN so long as you can be sure your client supports gzip encoding. I don't know of any major browsers that don't, as asset compression has become quite standard, especially in the mobile space.
If the data is really as static as you say then save the data as zipped xml files.
You can unzip them and zip them up again really easily as and when needed and in Rails generating an XML file is dead simple using SomeModel.to_xml the output of which can easily be sent to a file so maintaining them will be a simple too. You can just as eaily work this the other way round so that when it comes to reading the data in you can simply convert the data back into a model (Rails 3.x has ActiveModel which would be ideal for this scenario as the data is not backed by a database but you still get a ActiveRecord API and all the juice that AR gives you meaning that your views, controllers etc are working with a consistent api and consistent behaviour.
You have other options as well such as using ActiveResource but I wouldn't think that was necessary.
Not a recommended approach if you were not to be caching the data in the way you suggest (which is a neat solution BTW)