I want to create a website that will have an AJAX search. It will fetch the data either from a JSON file or from a database. I do not know which technology to use to store the data: a JSON file or MySQL. Based on some quick research it is going to be about 60,000 entries, so the file size if I use JSON will be around 30-50 MB, and if I use MySQL I will have 60,000 rows. What are the limitations of each technique and what are the benefits?
Thank you
I can't seem to comment since I need 50 rep., so I will give this as an answer:
MySQL will be preferable for many reasons, not the least of which being you do not want your web server process to have write access to the filesystem (except for possibly logging) because that is an easy way to get exploited.
Also, the MySQL team has put a lot of engineering effort into things such as replication, concurrent access to data, ACID compliance, and data integrity.
Imagine if, for instance, you add a new field that is required in whatever data structure you are storing. If you store in JSON files, you will have to have some process that opens each file, adds the field, then saves it. Compare this to simply using ALTER TABLE with a DEFAULT value for the field. (A bit of a contrived example, but how many hacks do you want to leave in your codebase for dealing with old data?)

So, to be really blunt about it: MySQL is a database while JSON is not, so the correct answer is MySQL, without hesitation. JSON is just a data format, and barely even that. JSON was never designed to handle anything like concurrent connections or any sort of data manipulation, since its whole function is to represent data, not to manage it.
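To make the contrast concrete, here is a minimal sketch of the two migrations (the data/ directory, the users table, and the status field are all hypothetical names, not anything from the question):

```php
<?php
// Sketch: adding a required "status" field to existing data.
// The data/ directory, users table, and field name are all hypothetical.

// JSON files: some process has to open, patch, and rewrite every file.
foreach (glob('data/*.json') as $path) {
    $record = json_decode(file_get_contents($path), true);
    $record['status'] = 'active';   // the new required field
    file_put_contents($path, json_encode($record));
}

// MySQL: a single statement, applied by the server.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->exec("ALTER TABLE users ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'active'");
```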
So go with MySQL for storing the data. Then you should use some programming language to read that database, and send that information as JSON, rather than actually storing anything in JSON.
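For example, a minimal sketch of such a search endpoint might look like this (the entries table, name column, and connection details are assumptions; adjust to your schema):

```php
<?php
// search.php - minimal AJAX search endpoint (sketch).
// Assumes an "entries" table with a "name" column.
$pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', 'user', 'pass');

// A prepared statement lets MySQL do the filtering and guards against SQL injection.
$stmt = $pdo->prepare('SELECT id, name FROM entries WHERE name LIKE ? ORDER BY name LIMIT 50');
$stmt->execute(['%' . ($_GET['q'] ?? '') . '%']);

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));
```

The front end then just requests search.php?q=term with AJAX and renders the JSON it gets back.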
If you store the data in files, whether in JSON format or anything else, you will have all sorts of problems that people have stopped worrying about since databases started being used for the same thing: size limitations, locks, you name it. It's good enough when you have one user, but the moment you add more of them, you'll be solving so many problems that you would probably end up writing an entire database engine just to handle the files for you, when all along you could have simply used an actual database.

Do note: don't take my word for it, as I am not an expert in this field, so let others post their answers and then judge by those. I think enough people here on Stack Overflow have more experience than I do, haha. These are not entirely my words, but I have taken the parts that were true from what I knew and added some of my own knowledge :) Have a great time making your website!
For MySQL: you can select specific rows or specific columns using queries, filter data based on a key, and order alphabetically.
Downside: you need a REST API to fetch the data, because the database can't be accessed directly; you have to use PHP, Python, or whatever programming language for the backend code.
For a JSON file: the benefit is that there is no backend code; the file is accessed directly using an HTTP GET request.
Downside: no filtering, ordering, or queries of any kind; you have to do it all manually (see the sketch below).
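To illustrate the "manual" part, here is a sketch in PHP (the entries.json file and its name key are made up for illustration; the same logic would run client-side in JavaScript if the browser fetches the file directly):

```php
<?php
// Sketch: filtering and ordering a flat JSON file by hand.
// entries.json and its "name" key are assumptions.
$entries = json_decode(file_get_contents('entries.json'), true);
$q = strtolower($_GET['q'] ?? '');

// Filter: keep entries whose name contains the search term.
$matches = array_filter($entries, fn ($e) => str_contains(strtolower($e['name']), $q));

// Order alphabetically by name - also done by hand.
usort($matches, fn ($a, $b) => strcmp($a['name'], $b['name']));

echo json_encode(array_values($matches));
```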
I am currently working on an analytics-like application. It has an AngularJS app which communicates with a Spring REST client app, from which the user creates a token (trackingID) and puts a generated script with this ID on his website to collect information about visitors' actions through another Spring REST tracking app. For the tracking app I am using MongoDB to collect visitor actions and visitor info, for fast insertion; but for the REST client app I use MySQL, with user/account details.
My question is how to migrate the Mongo data from the tracking app to MySQL, perhaps to gain the possibility of joins as the easiest and fastest way to analyze the data with any kind of filters from the AngularJS client app. Should I manually create workers that periodically transfer the data from the last checkpoint to the present state from Mongo to MySQL, or are there existing tools that can be set up for this transfer?
There is no official library to do this.
But you can use the mongoexport utility from MongoDB to export the data in CSV format, and mysqlimport to import it into MySQL.
Here are links to the documentation: MySQL import and MongoDB export.
One more method: you can write a program in your favorite language that reads from MongoDB and writes into MySQL.
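Here is a minimal sketch of that approach in PHP, using the mongodb/mongodb Composer library and PDO (the database, collection, and column names are invented for illustration):

```php
<?php
// Sketch: copy documents from MongoDB to MySQL.
// Requires the mongodb/mongodb Composer library; all names are hypothetical.
require 'vendor/autoload.php';

$mongo  = new MongoDB\Client('mongodb://localhost:27017');
$events = $mongo->tracking->visitor_actions;   // source collection

$pdo  = new PDO('mysql:host=localhost;dbname=analytics', 'user', 'pass');
$stmt = $pdo->prepare(
    'INSERT INTO visitor_actions (tracking_id, action, created_at) VALUES (?, ?, ?)'
);

// Only fetch documents newer than the last transferred checkpoint,
// so a periodic worker can run this incrementally.
$lastSync = new MongoDB\BSON\UTCDateTime(strtotime('yesterday') * 1000);

foreach ($events->find(['createdAt' => ['$gt' => $lastSync]]) as $doc) {
    $stmt->execute([
        $doc['trackingId'],
        $doc['action'],
        $doc['createdAt']->toDateTime()->format('Y-m-d H:i:s'),
    ]);
}
```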
MySQL 5.7 has a new JSON data type that can be very convenient.
You can create a table in MySQL to receive the JSON messages as is, and then use SQL to query it, or do post-processing to load the data into a structured set of database tables (see the sketch below).
Check this out: https://dev.mysql.com/doc/refman/5.7/en/json.html
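A minimal sketch of that pattern, via PDO (the messages table and the JSON keys are invented for illustration; the ->> extraction operator requires MySQL 5.7.13+):

```php
<?php
// Sketch: land raw JSON messages in MySQL 5.7+ and query them with SQL.
// Table and JSON field names are made up for illustration.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$pdo->exec('CREATE TABLE IF NOT EXISTS messages (
    id  INT AUTO_INCREMENT PRIMARY KEY,
    doc JSON NOT NULL
)');

// Store a message exactly as it arrived.
$pdo->prepare('INSERT INTO messages (doc) VALUES (?)')
    ->execute(['{"trackingId": "abc123", "action": "click"}']);

// Query inside the JSON with ->> (or JSON_EXTRACT).
$rows = $pdo->query(
    "SELECT doc->>'$.action' AS action FROM messages WHERE doc->>'$.trackingId' = 'abc123'"
)->fetchAll(PDO::FETCH_ASSOC);
```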
I realise this question is a few years old - but recently I've had a number of people enquiring whether a tool I developed (https://virtual.blue/apps/json-converter) can do exactly what the OP is asking (convert MongoDB to SQL) so I am guessing it is still something people want. Keep reading to find out why I am honestly not surprised by this.
The short answer to whether the tool can help you is: perhaps. If your existing data relationships are not too complicated, and your database is not enormous, it may well be worth a try.
However, I thought it might help to try and explain what the issues are with this kind of conversion, since all the answers I have seen so far are along the lines of "try tool X" or "first convert to format Y and then you can slurp it into MySQL using utility Z" - i.e. with no thought given to whether what you get at the end is going to make sense in terms of data relationships and integrity.
For example, you could just stick your entire database dump in a single field of a single SQL table (ok space limitations might prevent this in reality, but hopefully you get my point). Then your database would be "in MySQL format", but it would be absolutely no use to anyone.
The point is, what you actually want is a fully defined database model, correctly encapsulating all of the intrinsic data relationships. ("Database normalization" as it is known.) If your conversion process gets those relationships wrong, then you have a broken model, and any queries you try to run over it are likely to return nonsense. Unfortunately there is no magic tool that is just going to "know" the best way to represent your data in MySQL, and closing your eyes and shovelling it into a bunch of random tools is unlikely to miraculously get you what you want.
And herein lies the fundamental problem with the "NoSQL" philosophy (fad). They sold people the bogus notion of "non-relational data". My first thought when I heard this was, "How does that work? Surely all data is relational?" By the looks of things we are steadily getting more and more evidence that my instincts were right. ("NoSQL? Why stop there? I go with 'NoDatabase'. It returns no results at all, but it sure is fast!")
The NoSQL madness throws several important fundamental engineering principles to the wind. We shouted "don't hard code!", "DRY!" (Don't Repeat Yourself) because these actions infuse inflexibility into systems. Traditional wisdom makes precisely the same flexibility argument when it advises "create a fully described model with all the data relationships represented". Then you can execute any arbitrary query over it and expect meaningful results. "Yes but there are a whole bunch of queries we are never going to need to run," says the NoSQL proponent. But surely we learnt our lesson on things we are "never going to need to do"? ("I hard code liberally, because I know I am never going to want to change my code." Hmm...)
The arguments about speed are largely moot. Say it turns out you are frequently doing a complex 9 table join, with unsurprisingly sluggish performance. So create an index. Cache it. Swap some disk space for speed. The NoSQL philosophy is to swap data integrity for speed, which makes no sense at all.
When you generate your fast lookup index (cache/table/map/whatever) what you are really doing is creating a view over your model. If your model changes, you can readily update your view. Going from a model to a view is easy - it's a one to many operation and you are on the right side of entropy.
However, when you went with MongoDB you effectively decided to create views without bothering to describe your fundamental model. Now you discover there are queries you want to run, but can't - and so it's no wonder you want to move over to SQL and actually have your data modelled correctly. The problem is you now want to go from a view to a model. Now you're on the wrong side of entropy. Your view is a lossy representation of the model's fundamental relationships. You can't expect a tool to "translate" your database, because you are asking it to insert new relationships which were not originally defined. These are real world relationships that are not machine-guessable. The tool cannot know what relationships were intended.
In short the only way you can do this reliably is to get your hands dirty. An intelligent human, with complete understanding of the system you are modelling needs to sit down and carefully come up with (possibly a substantial amount of) code which effectively picks through the data and resolves all of the insufficiently represented data relationships. If your data is complex then it's going to be a headache and there is no way to cheat.
If your data is still relatively simple then I would suggest making the conversion as soon as possible, before it becomes difficult. In this case my tool (https://virtual.blue/apps/json-converter) may be able to help.
(They really should have asked a Physicist before they came up with all this nonsense...!)
You can download a trial version of Studio 3T for Mongo and export your database to SQL (or JSON) directly.
I have never used NoSQL before; generally the applications I write require relations. However, I have encountered something that I don't know how to go about. So far, I am only designing the database, and my main logic is in the MySQL database. I have static content that I will be hosting through a CDN. However, I also have dynamic content that will be updated very rarely but will be read on almost every request - like phone number, email address, postal address, and additional info. It will not be used for searching, but the data is unstructured: a user can have multiple email addresses, phone numbers, and addresses, and they would be needed across multiple tables.

So using a relational database in this case fails my needs (I don't want to create an Entity-Attribute-Value table for this), and since I know it doesn't affect the logic - it's only used as "metadata" - I want to keep it in a JSON format. After Googling for some time, I found out that MongoDB stores "documents" as JSON, which sounded like the perfect solution. However, I have one question regarding this: how do I connect these databases together? Do I just add a user_id or organization_id "column"/field to a document on create/update and do a "select" query (or whatever the equivalent is in MongoDB) to retrieve the metadata? Or is there a different way?
I'll present my opinion here. What you're trying to do is called "polyglot persistence". If you introduce Mongo, you'll have two architectures for storing your data, different in strengths, API, design, what not - and this has its price.
MongoDB is a great product - I've used it myself with great success - but you have to understand that it doesn't provide all the features you would expect from an RDBMS like MySQL. For example, it totally lacks transactions.
Moreover, if you store data in both MySQL and Mongo, you'll have to take care of data integrity yourself (what happens if, as part of a logical transaction, the MySQL transaction succeeds but Mongo fails to store the data?). There is no rollback...
I believe you've got my point.
Yes, Mongo really does allow you to query by various JSON fields; in fact it features a whole query language that resembles SQL to some extent. But it's not really a "relational" query engine, because Mongo is not a relational database, so you don't have JOINs, for example. But you've said yourself that you are not going to search by these fields, so I kind of don't understand what benefit you would get from using Mongo. Maybe this is only about terminology, but I'm a little confused by this statement.
Where Mongo really shines is when you have a lot of data (it's a big-data product, after all); then you have funny stuff like replica sets and sharding. But the question is whether you really need that: do you really have "big data" - a really huge number of objects to store?
As an alternative, I think you can use a text column for storing the JSON as is. Some databases even have a native "JSON" column type; MySQL supports one as of version 5.7. In that case you can even do some operations on the stored JSON (append, partial update, and so forth).
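A minimal sketch of that single-database pattern (assuming MySQL 5.7+ for the JSON type; the table and column names are invented, and the user_id column is how it would join back to your relational data):

```php
<?php
// Sketch: keep the unstructured "metadata" next to the relational data.
// Assumes MySQL 5.7+; all names are invented for illustration.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$pdo->exec('CREATE TABLE IF NOT EXISTS user_meta (
    user_id INT PRIMARY KEY,   -- joins against your users table
    meta    JSON NOT NULL
)');

// Partial update: change one phone number without rewriting the whole blob.
$pdo->prepare(
    "UPDATE user_meta SET meta = JSON_SET(meta, '$.phones[0]', ?) WHERE user_id = ?"
)->execute(['+1-555-0100', 42]);
```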
Of course the choice is yours; all I'm saying is that you should think about whether using two persistence engines gives you more benefits, or just makes your project more complicated.
Hope this helps
Although it's very easy to search on this topic, it's not as easy to come to a conclusion. What are some cons of storing HTML in a database?
HTML is static, and querying the data from a database uses database resources; database resources are typically among the more restricted on moderate to heavy use systems, therefore it makes sense to not store HTML in the database, but to place it on the filesystem, where it can be retrieved without using critical resources.
In the broadest sense, HTML is a document markup language and serves to structure data into a document. The database on the other hand should contain raw data organized along its logical relations. Documents use formatting and may present data redundantly, but the true, underlying data is always fixed. Thus you should store the most immediate, raw form of data that you possibly can, and retrieve it in meaningful ways using both the query language itself to create suitable views for your purposes, and other, output-specific data processing to generate documents.
Of course you may like to cache the result of an output formatting operation, and you may choose to store the cache in a database, too. That's fine of course. But concerning the raw payload data, I would always go for the above.
That depends on the use of the HTML in the database. If it's data that you only ever access as a blob (meaning you never/rarely query the contents of the HTML), then I think it can be a good idea in some cases. Then the question is essentially the same as "Should I store files in xyz format in my database?" And the answer to questions like that depends on several things:
How large are the files? Would storing them on the filesystem, with just their filename/path in the DB be more efficient?
Do you need to replicate the data to other servers? If so, then storing raw files in the DB may be easier than on the FS, if you already have DB-sync infrastructure in place.
What are your query uses like? Are they friendlier to a DB or a file system storage?
Now, if you're talking about storing HTML data that you frequently have to query, that changes the game entirely.
Any database normalization purist would tell you never to do it. But there might be cases when it's useful. For instance, if you're using some sort of full-text search engine, you may want that in a database--or in whatever form the full-text search engine uses.
I'm working on a WordPress theme and it needs to have some settings. Until now I have kept them in the database, but I also want them to be portable. For that reason I also saved the data as an array in a settings.php file. Now I'm considering not using the database at all, to avoid storing things twice.
Some questions about this:
Is there anything bad about storing data in an array within an included PHP file, compared to a MySQL database? (It's just settings; nothing needs to be sorted, no relations needed.)
Which is faster: including a PHP file with an array, or loading the data from the database?
Any other thoughts about this?
I'll accept the most complete answer to my questions; short answers might get an upvote.
I'm a pretty hardcore database guy and I would say they do not need to be in a database.
The clues are in your statements "It's just settings, nothing needs to be sorted, no relations needed" and "I also want them to be portable"
My main argument is simplicity. PHP is extremely good with arrays, it likes them, it understands them, can easily load them from files and save them to files. So even if you do change them from time to time from the app, updating and saving an array is no big deal. So, if you use the array, you use a native feature of PHP and that creates architectural simplicity for this feature.
So for portability, the most portable database is the one you do not use. When you have the simplicity of using a native PHP data format, you don't need the database (at least not for this).
For speed, on Linux anyway PHP can open a file and read it faster than it can make a roundtrip to the database for anything.
The only remaining argument against an array solution would be interaction with other data, but you have said there is none.
So, as a hardcore database guy, I would say do not use the database just because it is there. Databases are incredible for structured data, but if this is just a flat list of settings, it is not structured data. The DB can do it, the filesystem can do it; pick whichever is simpler. A sketch of the file-based approach follows.
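A minimal sketch of the array-in-a-file pattern (the file name and keys are just examples):

```php
<?php
// Sketch of the array-in-a-file pattern; file name and keys are examples.

// settings.php simply contains:  <?php return ['theme_color' => '#336699'];

// Load: a plain include, cached by the PHP opcode cache for free.
$settings = include __DIR__ . '/settings.php';

// Change a value in the app...
$settings['theme_color'] = '#993366';

// ...and save: var_export() writes the array back out as valid PHP source.
file_put_contents(
    __DIR__ . '/settings.php',
    '<?php return ' . var_export($settings, true) . ';'
);
```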
Don't create file-based storage for your WordPress theme settings if you want them to be portable; some sites might have the themes folder read-only.
For the initial setup it's OK to read your settings from almost anything file-based. Later on, they are best stored (and backed up) in the database.
If you store things in a file, use INI-style files, as PHP gives you a parsing API (parse_ini_file()) for free. Storing things as an array or serialized data usually only pays off if your options are less restricted.
Don't worry about performance too much; simply use the WordPress Options API as a best practice (see the sketch below).
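For reference, the Options API boils down to this ('my_theme_settings' and the keys are made-up names):

```php
<?php
// Sketch of the WordPress Options API; 'my_theme_settings' is a made-up name.
// WordPress stores the (serialized) array in the wp_options table and caches it.

// Save the whole settings array under one option name.
update_option('my_theme_settings', [
    'theme_color'    => '#336699',
    'posts_per_page' => 10,
]);

// Read it back, with a default if the option was never saved.
$settings = get_option('my_theme_settings', []);
```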
If you're worried about portability, you might be interested in using ODBC or PHP Data Objects (PDO).
As for which is faster: I'm no expert, but the settings file only involves reading a text file and parsing it, while the database option will usually involve a TCP connection (unless you use SQLite, which I would recommend if you are going to store more than just file paths and database names).
The main problem of having them in a file occurs if you want the settings to be configurable via the website itself. If not, then having them in the code is no real cause for concern.