d3.json() to load large file - json

I have a 96 MB .json file
It has been filtered to only the content needed
There is no index
Binaries have been created where possible
The file needs to be served all at one time to calculate summary statistics from the start.
The site: https://3milychu.github.io/met-erials/
How could I improve performance and speed, and/or convert the .json file to a compressed file that can be read client-side in JavaScript?

Most visitors will not hang around for the page to load -- I thought that the demo was broken when I first visited the site. A few ideas:
JSON is not a compact data format as the tag names get repeated in every datum. CSV/TSV is much better in that respect as the headers only appear once, at the top of the file.
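To make the size difference concrete, here is a minimal sketch that serializes the same records both ways and compares the lengths. The sample records and the helper names (csvEscape, toCsv) are made up for illustration; they are not taken from the asker's data.

```javascript
// Hypothetical sample records; the field names are illustrative only.
const records = [
  { id: 1, title: "Vase", material: "Ceramic", year: 1850 },
  { id: 2, title: "Bowl", material: "Glass", year: 1902 },
  { id: 3, title: "Chair", material: "Wood", year: 1780 },
];

// Quote a value only when it contains a comma, a double quote, or a newline.
function csvEscape(value) {
  const text = String(value);
  return /[",\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Write the header row once, then one row of values per record.
function toCsv(rows) {
  const headers = Object.keys(rows[0]);
  const lines = rows.map((row) => headers.map((h) => csvEscape(row[h])).join(","));
  return [headers.join(","), ...lines].join("\n");
}

const asJson = JSON.stringify(records);
const asCsv = toCsv(records);

// The CSV is shorter because the field names appear once instead of per record.
console.log("JSON length:", asJson.length, "CSV length:", asCsv.length);
```

On the client, a CSV file like this can be read with d3.csv() instead of d3.json(). Keep in mind that CSV values arrive as strings, so numeric fields need converting after loading.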
On the other hand, repetitive data compresses well, so you could set up your server to compress your JSON data (e.g. using mod_deflate on Apache or compression on nginx ) and serve it as a gzipped file that will be decompressed by the user's browser. You can experiment to see what combination of file formats and compression works best.
Do the summary stats need to be calculated every single time the page loads? When working with huge datasets in the past, summary data was generated by a daily cron job so users didn't have to wait for the queries to be performed. From user feedback, and my own experience as a user, summary stats are only of passing interest, and you are likely to lose more users by making them wait for an interface to load than you are through not providing summary stats or sending stats that are very slightly out of date.
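As a rough sketch of what such a precomputation step could look like: a script run offline (for example from a scheduled job) reduces the full dataset to a small summary object, which is then saved and served as its own tiny file. The field name "material" and the function names are assumptions for illustration.

```javascript
// Count how many records fall into each value of a field.
function countBy(records, field) {
  const counts = {};
  for (const record of records) {
    const key = record[field];
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Build the small summary object that would be saved as its own file.
// "material" is a hypothetical field name.
function buildSummary(records) {
  return {
    total: records.length,
    byMaterial: countBy(records, "material"),
  };
}

const sample = [{ material: "Glass" }, { material: "Wood" }, { material: "Glass" }];
const summary = buildSummary(sample);
console.log(JSON.stringify(summary));
// prints {"total":3,"byMaterial":{"Glass":2,"Wood":1}}
```

The page can then fetch the summary first and show the stats immediately, while the large dataset loads later or only on demand.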
Depending on how your interface / app is structured, it might also make sense to split your massive file into segments for each category / material type, and load the categories on demand, rather than making the user wait for the whole lot to download.
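One possible shape for this, sketched under assumptions: an offline step groups the records by category so each group can be written to its own file, and the page wraps its loader so each category is requested at most once. The function names and the file layout are hypothetical.

```javascript
// Group records by a field so each group can be written to its own file.
function splitByField(records, field) {
  const groups = new Map();
  for (const record of records) {
    const key = record[field];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record);
  }
  return groups;
}

// Wrap a loader so each category is requested at most once.
// loadCategory is whatever actually fetches the data, for example
// (name) => fetch("data/" + name + ".json").then((r) => r.json())
// where the "data/<name>.json" layout is an assumption.
function createCategoryLoader(loadCategory) {
  const cache = new Map();
  return function load(category) {
    if (!cache.has(category)) cache.set(category, loadCategory(category));
    return cache.get(category);
  };
}
```

Because the cache stores whatever the loader returns, it works equally well with promises: a second click on the same category reuses the pending or completed request.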
There are numerous other ways to improve the load time and (perceived) performance of the page -- e.g. bundle up your CSS and your JS files and serve them each as a single file; consider using image sprites to reduce the number of separate requests that the page makes; serve your resources compressed wherever possible; move the JS loading out of the document head and to the foot of the HTML page so it isn't blocking the page contents from loading; lazy-load JS libraries as required; etc., etc.

Related

Many html files (One per page) or a huge one containing multiple pages

I want to know whether I should have one HTML file per URL (home, register, login, contact; I have more than 50), or group them into about 5 files and select the page through a query parameter such as ?id=1, ?id=2, and so on.
I want to know which method is more convenient. As I understand it, the second method would have to load the whole combined file, which would be slower than loading the file for a single page.
But one file per page means more requests to and from the server, and the HTML files will be heavier overall, because I have to write a head and include all the shared files in each one of them.
In my experience, it works best to place every component with distinct functionality in its own file. I would count the examples you listed (home, register, login, contact, etc.) as distinct functionality. On the other hand, if you are managing blog posts (or something similar), I would definitely use GET parameters (e.g. ?page=1, ?page=2).
I have also maintained websites with about 50-100 different pages, but they used a content management system. If you feel overwhelmed, that is a possibility worth exploring.
If you choose not to use a CMS, I would recommend using partial files. A good example of a partial is a header or footer. With partials, you no longer need to replicate the same code on multiple pages (say goodbye to creating 50 navbars).

What is the most efficient way to display lots of data on a website?

I have an optimization question.
Let's say that I'm making a website with a JSON file of 5,000 pairs (about 582 KB), and that a combination of 3 sliders and some select tags can display every value. The time for each pair to appear is in the range of microseconds.
My question is: if the website also has to run on mobile browsers, where is it more efficient to keep the 5,000 pairs of data, in a JSON file or in the database? And why?
I am building a photo site with similar requirements, and after months of investigation and experimenting I can say that there is no easy answer to that question. But I will try to give you some hints:
Try to divide the data into chunks. For example, if your sliders select values from 1 through 100, don't deliver exactly what the client selected; widen the range a bit, maybe ±10 or more, so that you can continue filtering on the client side without a server round trip. Save all data in client memory before querying.
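A minimal sketch of that chunking idea, assuming each item has a numeric "value" field (a made-up name) and a slider range of 1 to 100: request a wider range than the user selected, remember which range is held in memory, and only go back to the server when the selection leaves it.

```javascript
// Widen the requested range by a margin, clamped to the slider's limits.
function paddedRange(min, max, margin, lowest, highest) {
  return {
    min: Math.max(lowest, min - margin),
    max: Math.min(highest, max + margin),
  };
}

// True when the range already held in memory contains the whole selection.
function isCovered(cached, min, max) {
  return cached !== null && cached.min <= min && cached.max >= max;
}

// Filter the in-memory chunk down to exactly what the user selected.
function filterRange(items, min, max) {
  return items.filter((item) => item.value >= min && item.value <= max);
}

// The user picks 40..60, so 30..70 is requested and remembered.
const cachedRange = paddedRange(40, 60, 10, 1, 100);
console.log(cachedRange);                    // { min: 30, max: 70 }
console.log(isCovered(cachedRange, 35, 65)); // true: no new request needed
console.log(isCovered(cachedRange, 25, 65)); // false: fetch a new chunk
```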
Don't render more than what is visible on the screen. JSON storage and filtering are fast, but the DOM is very slow, so minimize the number of visible elements.
Use 304 caching: whenever the client requests the same data twice, send a proper 304 response with an ETag. A good rule of thumb is to base the check on something that is cheap to look up, like the max ID in the database, to see whether any new data has been added since the last call. If not, just send 304 and the client will use whatever it had the last time.
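The decision itself is small. Here is a simplified sketch of the server-side logic, assuming the ETag is derived from the max ID as suggested above. The function names are hypothetical, and a real If-None-Match header can also carry several tags or weak validators, which this sketch ignores.

```javascript
// Build an ETag from a cheap-to-compute marker such as the highest row ID.
function etagFor(maxId) {
  return '"max-' + maxId + '"';
}

// Decide the response from the client's If-None-Match header value.
// loadBody is only called when the data actually has to be sent.
function respond(ifNoneMatch, maxId, loadBody) {
  const etag = etagFor(maxId);
  if (ifNoneMatch === etag) {
    return { status: 304, headers: { ETag: etag }, body: null };
  }
  return { status: 200, headers: { ETag: etag }, body: loadBody() };
}

// First visit: no header, so the full body is sent along with the ETag.
console.log(respond(undefined, 42, () => "all rows").status);  // 200
// Repeat visit with nothing new: the client echoes the ETag and gets a 304.
console.log(respond('"max-42"', 42, () => "all rows").status); // 304
```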
Use absolute positioning. Don't even try to use CSS float or anything like that; it will not work. Just calculate the position of each element yourself. This also helps with the second tip, because you can filter out all elements that fall outside the visible screen. You can still use CSS transitions, which give nice animations when users change the sliders.
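As an illustration, assuming a simple grid where every cell has the same size (the function names and the fixed-size grid are assumptions), both the position of an element and the set of elements worth rendering come down to a little arithmetic:

```javascript
// Top-left corner of item `index` in a grid with a fixed cell size.
function cellPosition(index, columns, cellWidth, cellHeight) {
  return {
    left: (index % columns) * cellWidth,
    top: Math.floor(index / columns) * cellHeight,
  };
}

// Indices of the items whose rows intersect the visible part of the page.
function visibleIndices(count, columns, cellHeight, scrollTop, viewportHeight) {
  const firstRow = Math.floor(scrollTop / cellHeight);
  const lastRow = Math.floor((scrollTop + viewportHeight - 1) / cellHeight);
  const start = Math.max(0, firstRow * columns);
  const end = Math.min(count, (lastRow + 1) * columns);
  const indices = [];
  for (let i = start; i < end; i += 1) indices.push(i);
  return indices;
}

// Item 5 in a 4-column grid of 100x80 cells sits in column 1, row 1.
console.log(cellPosition(5, 4, 100, 80)); // { left: 100, top: 80 }
// Scrolled down 160px with a 160px viewport: rows 2 and 3 are visible.
console.log(visibleIndices(100, 4, 80, 160, 160)); // items 8 through 15
```

Each visible element would then get position: absolute with the computed left and top values, and everything outside the returned indices is simply not in the DOM.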
You could experiment with IndexedDB to help with client-side querying, but unfortunately support across browsers is still not good enough, and you will hit the storage limit. It is better to rely on the ordinary HTTP cache with proper headers.
Good Luck!
A database like MongoDB would be good for this. It still uses the JSON syntax for storage so you can use the values from the JSON file. The querying is very fast too and you wouldn't have to parse the JSON file and store it in an object before using it.
Given the size of the data (just 582 KB), I would opt for the JSON file.
The drawback is a penalty when the app starts and loads the data into memory, but after that all queries run very fast in memory.
You need to weigh how many times your app would query the database against loading the file just once. Also consider whether your main target is mobile browsers or PCs.
For this volume of data I wouldn't try a database (another process consuming resources); just measure how many resources (time, memory) are needed to load the JSON file.
If the data is going to grow, you will need to rethink this, or maybe split your JSON file according to some criteria.

Multiple css files or one big css file?

Which one is better and faster? why?
using only one file for styling:
css/style.css
or
using several files for styling:
css/header.css
css/contact.css
css/footer.css
css/tooltip.css
The reason I'm asking is that I'm developing a site for users in Uganda who have very slow internet connections, so I want to make it as fast as possible.
Using a single file is faster because it requires fewer HTTP requests (assuming the amount of styles loaded is still the same).
So it's better to keep it in just one file.
Separating CSS should only be done if you want to keep for example IE specific classes separate.
As per Yahoo's Performance Rules [source], It is VERY IMPORTANT to minimize HTTP requests
From the source
Combined files are a way to reduce the number of HTTP requests by combining all scripts into a single script, and similarly combining
all CSS into a single stylesheet. Combining files is more challenging
when the scripts and stylesheets vary from page to page, but making
this part of your release process improves response times.
It is quite awkward to develop with combined files, so keep developing with multiple files, but combine them when you deploy the system on the web.
I really recommend using boilerplate's ant build script. You can find it here.
It combines and minifies CSS.
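To show what "combine and minify" amounts to, here is a deliberately naive sketch. It is not the ant build script mentioned above; it only concatenates the sources, strips comments, and collapses whitespace, and it will also alter whitespace inside quoted strings. A real build tool handles such cases properly.

```javascript
// Naive illustration of combining and minifying CSS sources.
function combineCss(sources) {
  return sources
    .join("\n")
    .replace(/\/\*[\s\S]*?\*\//g, "") // strip /* ... */ comments
    .replace(/\s+/g, " ")             // collapse runs of whitespace
    .replace(/\s*([{};])\s*/g, "$1")  // drop spaces around braces and semicolons
    .trim();
}

const headerCss = "/* header */\nh1 {\n  color: red;\n}\n";
const contactCss = "p { margin: 0; }";
console.log(combineCss([headerCss, contactCss]));
// prints h1{color: red;}p{margin: 0;}
```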
One CSS file is better than multiple CSS files because of the overhead the end user's browser incurs in making a separate request for each file. Other things you can do to improve performance include:
Enable gzip compression on your web server (e.g. on Apache) so that the files are compressed before downloading
Where possible, host your files geographically as close to the majority of your end users as possible
Use a CDN for your static content such as CSS files
Use CSS sprites
Cache your content
Note that there are tools available to help you do this. See 15 ways to optimise css for more information
It is always a better solution to bundle or combine multiple CSS or JavaScript files into fewer HTTP requests. The browser then requests far fewer files, which in turn reduces the time it takes to fetch them.
With proper caching, you save additional bandwidth and make even fewer HTTP requests.
Update:
There's a new Bundling feature in ASP.Net 4.5 which you might be interested in.
This allows you to keep CSS files separate at compile time, and at runtime gain the benefit of having the resources combined into one.
One resource file is always the fastest approach since you reduce the number of HTTP requests made to fetch those files.
I would suggest using YSlow, a great extension for Firebug that analyzes web pages and suggests ways to improve their performance.

Why people always encourage single js for a website?

I have read some website development material on the Web, and every time a person asks how to organize a website's JS, CSS, HTML and PHP files, people suggest a single JS file for the whole website. The argument is speed.
I clearly understand that the fewer requests there are, the faster the page responds. But I have never understood the single-JS argument. Suppose you have 10 webpages and each webpage needs a JS function to manipulate the DOM objects on it. If you put the 10 functions in a single JS file and let that file execute on every webpage, 9 out of 10 functions are doing useless work. CPU time is wasted searching for non-existent DOM objects.
I know that CPU time on an individual client machine is trivial compared to bandwidth on a single server machine. I am not saying that you should have many JS files on a single webpage. But I don't see anything wrong with every webpage referring to 1 to 3 JS files, as long as those files are cached on the client machine. There are many good ways to do caching. For example, you can use an expiry date, or you can include a version number in your JS file name. Compared to mixing the functionality for many webpages into one big JS file, I much prefer splitting the JS code into smaller files.
Any criticism of or agreement with my argument? Am I wrong? Thank you for your suggestions.
A function does no work unless it is called. So 9 uncalled functions do no work; they just take up a little extra space.
A client only has to make 1 request to download 1 big JS file, then it is cached on every other page load. Less work than making a small request on every single page.
I'll give you the answer I always give: it depends.
Combining everything into one file has many great benefits, including:
less network traffic - you might be retrieving one file, but you're sending/receiving multiple packets and each transaction has a series of SYN, SYN-ACK, and ACK messages sent across TCP. A large majority of the transfer time is establishing the session and there is a lot of overhead in the packet headers.
one location/manageability - although you may only have a few files, it's easy for functions (and class objects) to grow between versions. When you do the multiple file approach sometimes functions from one file call functions/objects from another file (ex. ajax in one file, then arithmetic functions in another - your arithmetic functions might grow to need to call the ajax and have a certain variable type returned). What ends up happening is that your set of files needs to be seen as one version, rather than each file being its own version. Things get hairy down the road if you don't have good management in place and it's easy to fall out of line with JavaScript files, which are always changing. Having one file makes it easy to manage the version between each of your pages across your (1 to many) websites.
Other topics to consider:
dormant code - you might think that the uncalled functions are potentially reducing performance by taking up space in memory and you'd be right, however this performance is so so so so minuscule, that it doesn't matter. Functions are indexed in memory and while the index table may increase, it's super trivial when dealing with small projects, especially given the hardware today.
memory leaks - this is probably the largest reason why you wouldn't want to combine all the code, however this is such a small issue given the amount of memory in systems today and the better garbage collection browsers have. Also, this is something that you, as a programmer, have the ability to control. Quality code leads to fewer problems like this.
Why it depends?
While it's easy to say throw all your code into one file, that would be wrong. It depends on how large your code is, how many functions, who maintains it, etc. Surely you wouldn't pack your locally written functions into the jQuery package and you may have different programmers that maintain different blocks of code - it depends on your setup.
It also depends on size. Some programmers embed encoded images as ASCII text in their files to reduce the number of files sent. These can bloat files. Surely you don't want to package everything into one 50 MB file, especially if there are core functions that are needed for the page to load.
So to bring my response to a close, we'd need more information about your setup because it depends. Surely 3 files is acceptable regardless of size, combining where you would see fit. It probably wouldn't really hurt network traffic, but 50 files is more unreasonable. I use the hand rule (no more than 5), but surely you'll see a benefit combining those 5 1KB files into 1 5KB file.
Two reasons that I can think of:
Less network latency. Each .js requires another request/response to the server it's downloaded from.
Fewer bytes on the wire and less memory. With a single file you can strip out unnecessary characters and minify the whole thing.
The JavaScript should be designed so that the extra functions don't execute at all unless they're needed.
For example, you can define a set of functions in your script but only call them in (very short) inline <script> blocks in the pages themselves.
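A small sketch of that pattern, with made-up page names and return values: the shared file only registers the per-page functions, and each page triggers just its own.

```javascript
// Shared file: every page downloads these definitions, but nothing runs yet.
// The page names and what each initializer does are hypothetical.
const pageInits = new Map([
  ["home", () => "home widgets ready"],
  ["contact", () => "contact form ready"],
]);

// Run only the initializer registered for the given page, if there is one.
function initPage(name) {
  const init = pageInits.get(name);
  return init ? init() : null;
}

// On the contact page, the inline block would contain just this call:
console.log(initPage("contact")); // contact form ready
// A page with nothing registered does no work at all:
console.log(initPage("login"));   // null
```

In the HTML, the call sits in a one-line inline script block after the shared file has been included, so no code goes looking for DOM elements that belong to other pages.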
My line of thought is that you have fewer requests. When you make a request in the head of the page, it stalls the output of the rest of the page. The user agent cannot render the rest of the page until the JavaScript files have been obtained. Also, JavaScript files download synchronously: they queue up instead of being pulled all at once (at least that is the theory).

Website files caching?

I want to know how long it is desirable to cache files such as CSS, HTML and JS through .htaccess settings, and why each file type gets a different time setting.
In a few examples I saw, someone cached HTML for 10 minutes, JS for a month and images for a year.
I think it depends on how often a resource is updated. Your HTML content is probably dynamic, so you can't cache it for a long time. Otherwise a visitor sees changes only after a long delay.
On the other side, pictures are rarely updated, so you can set longer cache time.
JavaScript files are often updated for new features or bug fixes. Maybe you can use a version number for these files (core.js?v=12323), so that changing the number in your HTML content makes visitors fetch them again. This way you can cache them for a longer time as well.
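To make the policy concrete, here is a sketch that expresses it in JavaScript rather than as .htaccess directives. The lifetimes are the ones from the example in the question; treating CSS like JS, the list of image extensions, and the function names are assumptions.

```javascript
// Cache lifetimes in seconds, following the example in the question.
const MAX_AGE = {
  html: 10 * 60,            // 10 minutes
  js: 30 * 24 * 60 * 60,    // about a month
  css: 30 * 24 * 60 * 60,   // assumption: treated like JS
  png: 365 * 24 * 60 * 60,  // a year
  jpg: 365 * 24 * 60 * 60,  // a year
};

// Pick a Cache-Control value from the file extension, ignoring any query string.
function cacheControlFor(path) {
  const ext = path.split("?")[0].split(".").pop().toLowerCase();
  const known = Object.prototype.hasOwnProperty.call(MAX_AGE, ext);
  return known ? "max-age=" + MAX_AGE[ext] : "no-cache";
}

// Append a version so a long-cached file is fetched again after a release.
function versioned(path, version) {
  const separator = path.includes("?") ? "&" : "?";
  return path + separator + "v=" + encodeURIComponent(version);
}

console.log(cacheControlFor("index.html"));  // max-age=600
console.log(versioned("core.js", 12323));    // core.js?v=12323
```

The two parts work together: because the versioned URL changes whenever the file changes, the long lifetime on JS never leaves visitors stuck with an old copy.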