I'm building a demo, and find that I am storing lots of data in localStorage, almost constantly writing and reading values, parsing and stringifying JSON, etc. etc.
Do you find yourself relying on localStorage much?
What are the dangers of overusing it?
Are there dangers in regularly translating variables from and to JSON for storage?
It seems to me that if I did a db storage of lots of this data, I'd significantly increase the number and size of queries to my db.
The demo is very user focused, so I'm storing stuff like items the user has selected and input the user has provided. The data stored locally is only of value/interest to the user.
If you were building something like a todo list with scheduled end dates and reminders, would you use localStorage? Why or why not?
I know one limitation is that the user would only be able to view this content on one browser on one machine, but that isn't an issue for now.
You should really only write data to local storage that should persist across pages. If you're constantly parsing/stringifying JSON, then your code is likely much slower than it needs to be. Aside from reducing performance, there's a limited amount of space available in local storage, so you should use it judiciously.
Ask yourself: "How much of this data needs to stick around after the user leaves this page?"
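To make the "parse once, write only when something changed" idea concrete, here is a minimal sketch. It is written in Python purely as neutral notation, with a plain dict standing in for localStorage; `DraftStore`, its method names, and the `"todo-demo"` key are made up for illustration and are not part of any real API. In the browser the same shape would wrap `localStorage.getItem` / `localStorage.setItem`.

```python
import json


class DraftStore:
    """Keeps working state in memory and serializes it only on flush()."""

    def __init__(self, storage, key):
        self.storage = storage  # stand-in for localStorage: maps str -> str
        self.key = key
        raw = storage.get(key)
        # Parse the stored JSON once, when the page (here: the object) loads.
        self.state = json.loads(raw) if raw is not None else {}
        self.dirty = False
        self.writes = 0  # counts how often we actually serialize

    def set(self, name, value):
        # Only mark the state dirty when the value really changed.
        if self.state.get(name) != value:
            self.state[name] = value
            self.dirty = True

    def get(self, name, default=None):
        return self.state.get(name, default)

    def flush(self):
        # Serialize at most once per batch of changes.
        if self.dirty:
            self.storage[self.key] = json.dumps(self.state)
            self.dirty = False
            self.writes += 1


storage = {}
store = DraftStore(storage, "todo-demo")
store.set("selected", ["milk", "eggs"])
store.set("note", "buy before Friday")
store.flush()
store.flush()  # nothing changed since the last flush, so nothing is written
```

All reads and writes in between go to the in-memory dict; the JSON round trip happens once on load and once per flush, for example when the user leaves the page.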
I want to create a website that will have an AJAX search. It will fetch the data either from a JSON file or from a database. I do not know which technology to use to store the data: a JSON file or MySQL. Based on some quick research, it is going to be about 60,000 entries, so the file size if I use JSON will be around 30-50 MB, and if I use MySQL the table will have 60,000 rows. What are the limitations of each technique, and what are the benefits?
Thank you
I can't seem to comment since I need 50 rep. for commenting, so I will give it as an answer:
MySQL will be preferable for many reasons, not the least of which being you do not want your web server process to have write access to the filesystem (except for possibly logging) because that is an easy way to get exploited.
Also, the MySQL team has put a lot of engineering effort into things such as replication, concurrent access to data, ACID compliance, and data integrity.
Imagine if, for instance, you add a new field that is required in whatever data structure you are storing. If you store in JSON files, you will have to have some process that opens each file, adds the field, then saves it. Compare this with simply using ALTER TABLE with a DEFAULT value for the field. (A bit of a contrived example, but how many hacks do you want to leave in your codebase for dealing with old data?)
So, to be really blunt about it: MySQL is a database while JSON is not, so the correct answer is MySQL, without hesitation. JSON is just a data format. It was never designed to handle anything like concurrent connections or any sort of data manipulation, since its only function is to represent data, not to manage it.
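As a rough illustration of the ALTER TABLE point, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the `entries` table and the `status` column are invented for the example, and the exact ALTER TABLE syntax can differ slightly between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.executemany("INSERT INTO entries (title) VALUES (?)", [("alpha",), ("beta",)])

# One statement adds the new required field to every existing row,
# filling it with the default value.
conn.execute("ALTER TABLE entries ADD COLUMN status TEXT NOT NULL DEFAULT 'active'")

rows = conn.execute("SELECT title, status FROM entries ORDER BY id").fetchall()
print(rows)  # [('alpha', 'active'), ('beta', 'active')]
```

With JSON files, the equivalent would be a script that opens, patches, and rewrites every file.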
So go with MySQL for storing the data. Then you should use some programming language to read that database, and send that information as JSON, rather than actually storing anything in JSON.
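A minimal sketch of that "query the database, send JSON" flow might look like the following. It again uses sqlite3 in place of MySQL so it is self-contained; `search_entries`, the `entries` table, and its columns are assumptions made up for this example, and a real site would call something like this from its search endpoint.

```python
import json
import sqlite3


def search_entries(conn, term, limit=20):
    """Return matching rows as a JSON string; the database does the filtering."""
    cur = conn.execute(
        # Parameterized query: the search term is never pasted into the SQL.
        # Note that % and _ inside `term` still act as wildcards in this sketch.
        "SELECT id, name FROM entries WHERE name LIKE ? ORDER BY name LIMIT ?",
        (f"%{term}%", limit),
    )
    return json.dumps([{"id": row[0], "name": row[1]} for row in cur])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO entries (name) VALUES (?)",
    [("banana",), ("apple",), ("pineapple",), ("cherry",)],
)

result = search_entries(conn, "apple")
print(result)
```

Only the handful of matching rows travel to the browser, instead of a 30-50 MB file.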
If you store the data in files, whether in JSON format or anything else, you will have all sorts of problems that people have stopped worrying about since databases started being used for the same thing. Size limitations, locks, you name it. It's good enough when you have one user, but the moment you add more of them, you'll start solving so many problems that you would probably end up writing an entire database engine just to handle the files for you, while all along you could have simply used an actual database.
Do note: don't take my word as gospel. I am not an expert in this field, so let others post their answers and then judge by that. I think enough people here on Stack Overflow have more experience than I do, haha. These are NOT entirely my words; I have taken the parts that I knew to be true and added some of my own knowledge :) Have a great time making your website.
For MySQL: you can select specific rows or specific columns using queries, filter data based on a key, and order results alphabetically.
Downside: the browser can't access the database directly, so you need a REST API to fetch the data, which means writing backend code in PHP, Python, or whatever language you prefer.
For a JSON file, the benefit: no backend code; the file is fetched directly with an HTTP GET request.
Downside: no filtering, ordering, or queries of any kind; you have to do all of that manually.
For example, consider something like Facebook or Twitter. All the user tweets / posts are retained indefinitely (so they must ultimately be stored within a static database). At the same time, they can rapidly change (e.g. with replies, likes, etc), so some sort of caching layer is necessary (e.g. you obviously can't be writing directly to the database every time a user "likes" a post).
In a case like this, how are the database / caching layers designed and implemented? How are they tied together?
For example, is it typical to begin by implementing the database in its entirety, and then add the caching layer afterward?
What about the other way around? In other words, begin by implementing the majority of functionality into the cache layer, and then write another layer which periodically flushes the cache to the database (at some point when its activity has gone down)? In this scenario, for current / rapidly changing data, the entire application would essentially be stored in cache.
Or perhaps implement some sort of cache-ranking algorithm based on access / update frequency?
How then should it be handled when a user accesses less frequent data (which isn't currently in cache)? Simply bypass cache completely / query the database directly, or should all data be cached before it's sent to users?
In cases like this, does it make sense to design the database schema with the caching layer in mind, or should it be designed independently?
I'm not necessarily asking for direct answers to all these questions, but they're just to give an idea of where I'm coming from.
I've found quite a bit of information / books on implementing the database, and implementing the caching layer independent of one another, but not a whole lot of information on using them in conjunction / tying them together.
Any information, suggestions, general patterns, articles, or books would be much appreciated. It's just difficult to find some direction here.
Thanks
Probably not the best solution, but I worked on a personal project using OpenResty where I used its shared memory zones as the cache, to avoid the overhead of connecting to something like Redis, then used Redis as the backend DB.
When a user loads a resource, it checks the shared dict. On a miss, it loads the resource from Redis and writes it to the cache on the way back.
If a resource is created or updated, it's written to the cache, and also queued to a shared dict queue.
A background worker ticks away waiting for new items in the queue, writing them to Redis and then sending an event to other servers to either invalidate the resource in their cache if they have it, or even pre-cache it if needed.
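The flow described above (read-through on a miss, write to the cache plus a queue, background flush to the backing store) can be sketched roughly like this. Plain Python dicts and a deque stand in for the shared memory zone, the queue, and Redis; `WriteBehindCache` and its method names are hypothetical, and this is not OpenResty or Redis client code.

```python
from collections import deque


class WriteBehindCache:
    def __init__(self, backend):
        self.backend = backend  # dict standing in for the backing store
        self.cache = {}         # dict standing in for the shared memory zone
        self.pending = deque()  # keys waiting to be written to the backend

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.backend.get(key)  # cache miss: load from the backend
        if value is not None:
            self.cache[key] = value    # populate the cache on the way back
        return value

    def put(self, key, value):
        # Writes land in the cache immediately and are queued for persistence.
        self.cache[key] = value
        self.pending.append(key)

    def flush(self):
        """What the background worker would do on each tick."""
        flushed = []
        while self.pending:
            key = self.pending.popleft()
            self.backend[key] = self.cache[key]
            flushed.append(key)
        # In the setup above, these keys would be broadcast to other servers
        # so they can invalidate or pre-cache the resource.
        return flushed


backend = {"post:1": {"likes": 3}}
cache = WriteBehindCache(backend)
cache.get("post:1")                  # miss, loaded from the backend
cache.put("post:1", {"likes": 4})    # the backend is not touched yet
before = backend["post:1"]["likes"]
flushed = cache.flush()
after = backend["post:1"]["likes"]
```

The trade-off is the one raised in the question: anything still sitting in the queue is lost if the process dies before the worker flushes it.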
I am working with a lot of separate data entries and unfortunately do not know SQL, so I need to know which is the faster method of storing data.
I have several hundred, if not in the thousands, individual files storing user data. In this case they are all lists of Strings and nothing else, so I have been listing them line by line as such, accessing the files as needed. Encryption is not necessary.
test
buyhome
foo
etc. (About 75 or so entries)
More recently I have learned how to use JSON and had this question: Would it be faster to leave these as individual files to read as necessary, or as a very large JSON file I can keep in memory?
In-memory access will always be much faster than disk access. However, if your in-memory data is modified and the system crashes, you will lose that data unless it has been saved to some form of persistent storage.
Given the amount of data you say you are working with, you really should be using a database of some sort. Either drop everything and go learn some SQL (the basics are not that hard) or leverage what you know about JSON and look into a NoSQL database like MongoDB.
You will find that using the right tool for the job often saves you more time in the long run than trying to force the tool you currently have to work. Even if you need to invest some time upfront to learn something new.
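To give a feel for how little SQL this needs, here is a hedged sketch using Python's built-in sqlite3 module, which is just one possible database and not necessarily the one you would pick. The `user_entries` table and the helper names are invented for the example.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path to persist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_entries (username TEXT NOT NULL, entry TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_user_entries_username ON user_entries (username)")


def save_entries(username, entries):
    """Replace one user's list of strings in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM user_entries WHERE username = ?", (username,))
        conn.executemany(
            "INSERT INTO user_entries (username, entry) VALUES (?, ?)",
            [(username, entry) for entry in entries],
        )


def load_entries(username):
    """Return one user's strings in insertion order."""
    cur = conn.execute(
        "SELECT entry FROM user_entries WHERE username = ? ORDER BY rowid",
        (username,),
    )
    return [row[0] for row in cur]


save_entries("alice", ["test", "buyhome", "foo"])
save_entries("bob", ["bar"])
print(load_entries("alice"))  # ['test', 'buyhome', 'foo']
```

One table replaces the thousands of per-user files, and each lookup reads only the rows for the user you ask about.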
First thing is: DO NOT keep data in memory. Unless you are creating a portal like SO or Reddit, RAM as storage is a bad idea.
Second thing is: reading a file is slow. Opening and closing a file is slow too. Try to keep the number of files as low as possible.
If you are going to use each and every one of those files (the key issue is EVERY), keep them together. If you will only need some of them, store them separately.
Whenever I'm to prepare a long form for the client I always want to split it into separate pages, so the visitor doesn't have to fill it all, but does it in steps.
Something like:
Step 1 > Step 2 > Step 3 > Thank You!
I've never done it for one reason: I don't know how to store the data from separate steps efficiently? By efficiently I mean, how to store it, so when a visitor decides not to finish it at Step 3 all the data is deleted.
I've come up with a few ways this could be resolved, but I'm just not convinced by any of them:
1. Store form data in the database
I can imagine a table with columns representing each question, with a final column holding a bool value for whether the form has been completed.
But I would have to clean up the table every now and then (maybe even every time it gets updated with new data?) and delete all entries with complete = 0.
2. Store form data in session data
This, on the other hand, does not have to store data in the database (depending on how sessions are being handled), and all the info would be in a cookie. But what if the browser doesn't support cookies or the user has disabled them (rare, but it happens)? And if the form has file attachments, this is a no-go.
3. Echo form data from the previous page as <input type="hidden"> on the next page
Yes, I'm aware this is a rather stupid idea, but it's an alternative. Poor, but it is.
Option 1 seems to be the best, but I feel it's unnecessary to store temporary data in DB. And what if this becomes a very popular form, with a lot of visitors filling it in? The amount of Updates/Deletes could be massive?
I want to know how you deal with it.
Edit
David asked a good question: what technology am I using?
I personally use PHP+MySQL, but I feel like it's a more generic question. Please share your solutions no matter what server-side technology you use, as I'm sure the concept can be adapted one way or another to different technologies.
I think the choice between options 1 and 2 comes down to how much data you are storing. I think in most cases the amount of data you are collecting on a form is going to be fairly small (a few kilobytes). In that case, I think storing it in the session data is the way to go. There's not that much overhead in passing that amount of data back and forth. Also, unless your users are on computers where there is a strict security policy in place, the application should work. If you make the page requirements clear users can decide if they want to proceed or not.
If you are storing large amounts of data for the form then a database would be better so you don't need to pass the data back and forth. However, I think this is going to be a fairly rare situation. Even if the application allows the uploading of files you can save those to a temporary location and only write them to the database once the form is completed. The other situation where you might want to use a database is if your form needs to be able to support the user leaving and coming back at a later time to resume the form.
I agree that option 1 is the best, because it has a few benefits over the other 2:
If the data is persisted, users can come back later and continue the process
Your code base will be much cleaner with incremental saves, and it alleviates the need for 1 massive save operation
Your footprint (each page request) will be lighter than with option 3
If you're worried about performance, you can queue the data to be saved, since it's not necessary to save it near-real-time.
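A rough sketch of the incremental-save approach, including the cleanup the question worries about, could look like this. It uses Python with sqlite3 rather than PHP+MySQL so that it runs standalone; the `form_drafts` table, its columns, and the function names are all assumptions made for illustration.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE form_drafts (
           token TEXT PRIMARY KEY,
           name TEXT,
           email TEXT,
           message TEXT,
           complete INTEGER NOT NULL DEFAULT 0,
           updated_at REAL NOT NULL
       )"""
)

ALLOWED_FIELDS = {"name", "email", "message"}


def save_step(token, fields, now=None):
    """Save the answers from one step; creates the draft row if needed."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    now = time.time() if now is None else now
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO form_drafts (token, updated_at) VALUES (?, ?)",
            (token, now),
        )
        for column, value in fields.items():
            # Column names come from the whitelist above, never from user input.
            conn.execute(
                f"UPDATE form_drafts SET {column} = ?, updated_at = ? WHERE token = ?",
                (value, now, token),
            )


def mark_complete(token):
    with conn:
        conn.execute("UPDATE form_drafts SET complete = 1 WHERE token = ?", (token,))


def purge_abandoned(max_age_seconds, now=None):
    """Delete incomplete drafts that have not been touched recently."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM form_drafts WHERE complete = 0 AND updated_at < ?",
            (now - max_age_seconds,),
        )
    return cur.rowcount


save_step("abc", {"name": "Ann"}, now=0)
save_step("abc", {"email": "ann@example.com"}, now=10)
save_step("xyz", {"name": "Bob"}, now=0)   # this visitor never finishes
mark_complete("abc")
removed = purge_abandoned(3600, now=7200)
```

The purge can run from a scheduled job instead of on every request, which keeps the delete traffic small even for a busy form.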
Edit to clear up a misconception: the data in PHP sessions is, by default, NOT stored in cookies. It is kept on the server, and a session can hold a lot of data without too much overhead.
I'd go with number 2, but use the cookie only for identifying the user. The session data should actually be stored on your server and the cookie merely provides a lookup key to the session object that contains all the details.
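Here is a minimal sketch of that idea: the cookie carries only a random ID, and the form data lives in a server-side store keyed by it. `SessionStore` and its methods are hypothetical names, and the in-memory dict is only a stand-in for whatever your platform's session mechanism really uses.

```python
import secrets


class SessionStore:
    def __init__(self):
        # Stand-in for the real storage (files, a database, a cache server).
        self._sessions = {}

    def create(self):
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {}
        return session_id  # the only value that goes into the cookie

    def save_step(self, session_id, step, data):
        self._sessions[session_id][step] = dict(data)

    def collect(self, session_id):
        """Merge the data from all steps, in step order."""
        merged = {}
        for step in sorted(self._sessions[session_id]):
            merged.update(self._sessions[session_id][step])
        return merged

    def destroy(self, session_id):
        # Called on completion or when the session expires.
        self._sessions.pop(session_id, None)


store = SessionStore()
sid = store.create()                       # sent to the browser as a cookie
store.save_step(sid, 1, {"name": "Ann"})
store.save_step(sid, 2, {"email": "ann@example.com"})
final = store.collect(sid)                 # used on the "Thank You" step
store.destroy(sid)
```

If the visitor abandons the form at step 3, nothing has to be cleaned out of your main tables; the session simply expires.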
If the site becomes popular and needs to run on more than a single web server, then your session data will need to be persisted in some kind of database anyway. In that case you would need a database that could handle massive amounts of transactions.
Note: I agree that this is a platform independent question. Stack Overflow users prefer to see code in questions and prefer to give code in answers, so that's why I normally ask what language someone is using.
To be brutally honest, just use the database as in option 1 and stop worrying about data volumes. Seriously, if your site is so successful that it becomes a problem, then you ought to be able to fund a revamp to cope.
There's nothing wrong with taking the POST data from the previous step and adding hidden input elements. Just take all the POST data from the previous page that you care about and get them into the current page's form. This way, you don't have to worry about using persistent storage in any form, whether it's on the client side or the server side.
What are the perceived downsides? That there are a lot of extra elements on the page? Not that the user sees. All you have to do is add an element for each input you ask the user to give (on every page, if you want the user to be able to go back). Besides these elements, which don't give any visual clutter, there's nothing extra.
There's also the fact that all the form data will have to be transmitted on every page load. Sure, but this is probably going to be faster than a lookup in a database, and you don't have to worry about getting rid of stale data.
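For what it's worth, carrying the previous steps forward takes only a few lines. The sketch below is in Python rather than PHP, and `hidden_inputs` is a made-up helper name. The one thing worth getting right is escaping the values, and since hidden fields come back from the browser like any other input, they should be validated again on the final step.

```python
from html import escape


def hidden_inputs(previous_post):
    """Render the previous steps' POST data as hidden form fields."""
    lines = []
    for name, value in previous_post.items():
        # escape() also escapes quotes, so values cannot break out of
        # the attribute.
        lines.append(
            f'<input type="hidden" name="{escape(name)}" value="{escape(value)}">'
        )
    return "\n".join(lines)


markup = hidden_inputs({"name": 'Ann "A" <b>', "email": "ann@example.com"})
print(markup)
```

The returned markup is dropped inside the next step's form, so the earlier answers are resubmitted along with the new ones.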