Minimize size of HTML file

I have a large HTML file being generated for a report at the moment (around 2-3 MB), and this file is going to be transferred many times. It is not being accessed through any form of web host; it is just a file accessed over a network, but the network spans the world and is therefore not fast everywhere.
I know about gzip compression, but from the looks of it that will only work with an Apache web server or something similar, configured via the .htaccess file. I have already stripped the whitespace from the HTML file. My question is: besides just zipping it up in a standard archive, what else can I do to minimize the size of the file?
Thanks, and I will be happy to answer any other questions.

You can certainly look at the HTML structure itself to see if you can reduce the number of tags. For example, do you have a bunch of nested table structures that could be replaced? Do you have inline styles that could be put into a separate stylesheet? Do you have any JavaScript content that could be put into a separate file?

I don't think you can serve it compressed without a proper web server, because it is the web server that tells the browser, in the HTTP response, that the file needs to be decompressed.
If markup makes up the greater part of the file (i.e. there are more tags and scripts than text), you can use CSS to reduce the size.
If the data makes up the greater part, so there is more information than tags, I suggest you use a web server (Microsoft IIS can also compress responses).
But, if possible, also consider splitting the data into several files, for example with different levels of detail.

It is possible to store compressed data within the HTML file and use JavaScript to decompress the data dynamically as the page is rendered, using a JavaScript implementation of a decompression routine. See this answer for references: JavaScript implementation of Gzip
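As a rough sketch of that idea, the snippet below gzips the report data at build time, base64-encodes it so it can sit inside the HTML file, and decompresses it again on page load. It assumes a runtime that provides the built-in CompressionStream and DecompressionStream APIs (recent browsers, Node 18+), which is a swap-in for the pure-JavaScript gzip library the linked answer refers to. The function names are made up for illustration.

```javascript
// Decode a base64 string into raw bytes.
function base64ToBytes(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Encode raw bytes as a base64 string.
function bytesToBase64(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Build step (hypothetical): gzip the report data and base64-encode it for embedding.
async function gzipToBase64(text) {
  const stream = new Blob([text]).stream().pipeThrough(new CompressionStream('gzip'));
  const buffer = await new Response(stream).arrayBuffer();
  return bytesToBase64(new Uint8Array(buffer));
}

// Page-load step: turn the embedded base64 payload back into the original text.
async function gunzipFromBase64(b64) {
  const stream = new Blob([base64ToBytes(b64)])
    .stream()
    .pipeThrough(new DecompressionStream('gzip'));
  return new Response(stream).text();
}
```

Base64 adds roughly a third on top of the compressed size, so this only pays off when the data compresses well, as repetitive report tables usually do.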

Related

embed data or not what are best practices for serving/parsing dynamic content

I am starting to use Go for serving dynamic HTML content, parsing templates, replacing variables, etc. So far all good: I found that I could create a single binary and deploy a single file including all the static files by using packages like go-bindata.
But when it comes to performance, what are the best practices to follow?
If I am right, having a single binary with all the static content embedded will result in a bigger file.
A binary that parses the templates (*.tpl) only at startup may be smaller, but it will need to be shipped along with all the static content.
If space is the only difference, a single binary looks like the more comfortable way to go in some cases, but not being an expert on the topic, I would like to know some best practices to follow, keeping an eye on performance.

If you add something like
var templates = template.Must(template.ParseGlob("templates/*.html"))
in global scope, then they are parsed only on startup.
If you upload and run your app on some server, then having separate files is probably more convenient, because you can use rsync to avoid uploading files that haven't changed since the last upload.
Putting everything to one file can make things easier if you want to distribute only one executable for download.

The simplest way to store data on a PC

I'm creating a page for myself that could be accessed without internet connection (local storage only).
I want that page to somehow store data (that I put in the website) on my computer.
I've heard there are ways to edit .txt files with the help of PHP?
Also maybe Chrome could somehow save that info easier?
Appreciate any help
EDIT: I want a fast and easy access to a website via Chrome only, so I prefer not to be using XAMPP or any other software.

The easiest way would be to use HTML5's localStorage (no server-side languages needed), but it won't be easy to get that data outside of your page (I understood you'll be using that offline page which has stored data).
It's as simple as:
window.localStorage.setItem('myItem', 'Hello World');
And then to get it, you'd just do:
window.localStorage.getItem('myItem');
Property-style access works as well (localStorage.myItem, localStorage['myItem'], etc.).
Read more about it here and here.
Here is a simple example from above: http://jsfiddle.net/h6nz1Lq6/
Notice how the text remains even after you remove the setter line and rerun the script (or just go to this link: http://jsfiddle.net/h6nz1Lq6/1/).
The downside of this approach is that the data can easily be cleared by accident (by clearing browser/website data, but again this is similar to accidental deleting of a file, so nothing to be afraid of if you know what you're doing) and that it doesn't work across browsers (each browser stores its own localStorage).
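Since localStorage only stores strings, structured data usually goes through JSON first. The sketch below shows one way to do that; the 'notes' key and the function names are made up for illustration, and the optional storage argument exists only so the functions can be tried outside a browser.

```javascript
const NOTES_KEY = 'notes'; // hypothetical key name

// Serialize the notes array and store it under one key.
function saveNotes(notes, storage = window.localStorage) {
  storage.setItem(NOTES_KEY, JSON.stringify(notes));
}

// Read the notes back; getItem returns null when nothing was stored yet.
function loadNotes(storage = window.localStorage) {
  const raw = storage.getItem(NOTES_KEY);
  return raw === null ? [] : JSON.parse(raw);
}
```

On the page itself you would call saveNotes(list) after each change and loadNotes() once on startup.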
If you still decide to use a server-side language, there are millions of tutorials about them. For a beginner, it would probably be the easiest to use a simple PHP script to write a file, but that would require using a server on your machine.

PHP example:
<?php
$file = fopen("test.txt","w");
echo fwrite($file,"Hello World. Testing!");
fclose($file);
?>
Taken from http://www.w3schools.com/php/func_filesystem_fwrite.asp

You can read and write files directly with PHP, or use a database for I/O. Look into PHP + MySQL for a common solution, and use an HTML file upload or a textarea field for plain text.

individual JS file XMLHttpRequest vs combined gzip download

Some stats before I state the situation:
total JS code = 122 MB
minified = 36 MB
minified and gzip = 4 MB
I would like to get the entire 4 MB down in one shot (with a loading progress indicator on the page), uncompress them, but not parse them yet. We don't want the code expanding in browsers memory when a lot of it might not be required at this point. The parsing should happen when a script tag with the corresponding js file name is encountered.
Intention: faster one shot download of js files, but keeping the behaviour unchanged from the browser perspective.
Do any such solutions exist? Am I even thinking sane?
If yes, I know how to get the gzip, I would like to know how to keep them in the browser cache so that when a script tag is encountered the browser doesn't fire a XMLHttpRequest for it again.

The trick is to leverage HTTP caching directives. For a start, take a look at this. You should only need to fetch your JS code once, because you can safely set the cache directive to instruct the browser to hold on to the JS file indefinitely (subject to space). Indefinitely in this context typically means a far-future expiry date, such as the year 2035.
When you're ready to update all your browser-side caches with a new version of the JS file, simply use a cache-busting query string. Any serial number or time and date will do, or a simple version number, e.g.:
<script src="/js/myfile.js?v2.1"></script>
Some minification frameworks handle the cache-busting for you. A good technique for example is those that MD5 the contents and use that as the cache buster query string. That way, whenever your source JS changes the browser will request the new version (because the QS is embedded in your HTML script tag) and then cache for as long as possible again.
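A minimal sketch of the content-hash variant, written as a Node build-step helper. The function names and the `v` query parameter are assumptions made for illustration; real build tools each have their own conventions.

```javascript
const crypto = require('node:crypto');

// Build a script URL whose query string changes whenever the file contents change.
// `path` and `contents` stand in for whatever the build step reads from disk.
function cacheBustedUrl(path, contents) {
  const hash = crypto.createHash('md5').update(contents).digest('hex');
  return `${path}?v=${hash}`;
}

// Emit the script tag that goes into the generated HTML.
function scriptTag(path, contents) {
  return `<script src="${cacheBustedUrl(path, contents)}"></script>`;
}
```

Identical contents always give the same URL, so the browser keeps using its cached copy until the source actually changes.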
XMLHttpRequest will honour the caching directives you set.
In the other part of your question, I believe what you're asking is whether you can download one combined script file and then only refer to parts of it with individual script tags on the page. No - I don't believe you can do that. If you want to refer to individual files you would need to have a HTTP URL and caching directives for each piece of GZIPped content you want to use separately. However, you might find this is as much or maybe even more performant than one big file at first depending on how much parallelisation you can achieve.
A neat trick here is to pre-load a lot of what you need. Google have been doing this on the home page for years. Basically, they pre-load stacks of resources (images certainly, but possibly also JS). So whilst you're thinking about what search query to enter, they are already loading the cache up with stuff you'll want on the subsequent page.
So you could use XMLHttpRequest to fetch your JS files (without parsing them) well before you need them. Then by the time your <script/> tag refers to them they'll already be downloaded and you just need to parse them.
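The pre-loading step might look roughly like the sketch below. It uses fetch rather than XMLHttpRequest (an assumption; either should populate the HTTP cache when the server's caching headers allow it), and the function name, the progress callback and the injectable fetchFn parameter are made up for illustration.

```javascript
// Download each script so it can land in the HTTP cache.
// The body is read but never evaluated, so nothing is parsed yet.
async function prefetchScripts(urls, onProgress = () => {}, fetchFn = fetch) {
  let done = 0;
  await Promise.all(urls.map(async (url) => {
    const response = await fetchFn(url);
    if (!response.ok) {
      throw new Error(`Failed to prefetch ${url}: ${response.status}`);
    }
    await response.text(); // drain the body; the text is discarded
    done += 1;
    onProgress(done, urls.length); // e.g. update a loading indicator
  }));
}
```

Whether a later script tag is then served from cache depends on the response headers and on the browser, so it is worth confirming in the network panel.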

In addition to cirrus's point about using HTTP caching, you could break that still-pretty-large 4 MB file down into pieces and load each piece only when its functionality is required.
It's more HTTP requests, but 4MB is a big hit in one go.
Suggest something like require.js to load in the appropriate files when they are needed:
http://requirejs.org/docs/start.html
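The underlying pattern, independent of require.js, is to load each piece on first use and reuse it afterwards. The sketch below shows only that pattern; it is not the require.js API, and loadOnce and the loader argument are hypothetical names.

```javascript
// Cache of in-flight or finished loads, keyed by chunk name.
const loaded = new Map();

// `loader` is a hypothetical function that fetches and evaluates one chunk,
// for example (name) => import(`./chunks/${name}.js`) in a browser with ES modules.
function loadOnce(name, loader) {
  if (!loaded.has(name)) {
    const promise = Promise.resolve().then(() => loader(name));
    // Forget failed loads so a later call can retry.
    promise.catch(() => loaded.delete(name));
    loaded.set(name, promise);
  }
  return loaded.get(name);
}
```

Each feature then calls loadOnce('reports', loader) right before it needs the code, and repeated calls share one download.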

Is html sent compressed and/or minified?

I'm thinking about using the jQuery Ajax load method. In some cases, the html I want to load is quite large. I'm wondering if the browser already streamlines the process behind the scenes, or should I minify and/or compress the html before calling .load() from jQuery? If so, which one? or both? Is there a standard way to perform minification and/or compressing in this scenario?
UPDATE
Does this make any sense:
The data I'm going to retrieve from the server is static. Let's say I have data for apples, oranges, kumquats, and papayas, and none of it changes "on the fly" (only when I update the site).
So is it preferable that I get the data as Json via jQuery this way:
$.getJSON('kumquats')
(...and then, of course, process the results that come back)... OR ...simply send back the html with no need of massaging, as "kumquats" will always send back the exact same html, "oranges" will always be the same html, etc.
In the latter option, then, I would do something like this (jQuery pseudocode) instead:
$('#MainContent').load('/Content/Kumquat.htm');
In summation, I can send all the HTML fully formed across the wire, and clog up the pipes with some extra bits for a bit, OR I can send a less verbose representation of the data (JSON), and then massage it in the .getJSON() callback function, transforming it into HTML. Performance-wise, does it make much difference? BTW, this is not "sensitive" data - it doesn't matter who sees it as it zips by through the ether.

I'm wondering if the browser already streamlines the process behind the scenes
The browser can't control how much data the server sends in its response.
or should I minify and/or compress the html before calling .load() from jQuery?
You call load on the client. The server has to do any minification or compression of the HTML.
Is there a standard way to perform minification and/or compressing in this scenario?
Compression is usually handled by gzip encoding. How you set that up depends on your server and/or the server side programming language that is generating the content.
I'm not aware of any standard way to perform minification. I used HTML Tidy to do that once.

The browser can't minify HTML before downloading it. The only reason to minify is to reduce download time by decreasing the size of the download, so minifying on the client would defeat the purpose.
Your server needs to minify and/or compress. It is probably already compressing by default (mod_deflate on Apache, for example). Minification of the HTML can be done in a variety of ways depending upon the server-side technology you are using. There may be a library for it, or you could use a third-party CDN to minify and serve the content for you.
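For the JSON option raised in the question, the client-side "massaging" step could look something like the sketch below. The `name` and `facts` fields are made up for this example, since the question does not say what the fruit data contains.

```javascript
// Escape text so values from the JSON payload cannot inject markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Turn one fruit record into the HTML fragment that would otherwise be sent whole.
function renderFruit(fruit) {
  const items = fruit.facts.map((fact) => `<li>${escapeHtml(fact)}</li>`).join('');
  return `<h2>${escapeHtml(fruit.name)}</h2><ul>${items}</ul>`;
}

// In the page (assuming jQuery is loaded):
// $.getJSON('kumquats', (fruit) => $('#MainContent').html(renderFruit(fruit)));
```

With gzip enabled on the server, the size difference between the two options often shrinks, because repeated tags compress well; measuring both is the only reliable way to decide.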

Single page web app: single html file or several files loaded using ajax?

I have this relatively large web app, it is a single page with ajax calls for the business logic.
Currently I have a small HTML file that loads all CSS and JS files, and then loads the actual content of the page using ajax, so I have about 15 HTML files to load for a single page (each HTML file is a "div" in the main HTML page).
Several files are easier to maintain, but my question is: what is better in terms of performance / User experience?
Keep it as is now (several files loaded async) OR have a script that joins all the files on "compile" time (when deploying)?
I understand that having a single html file is more efficient in terms of network performance, but on the other hand a small file will load faster, and the rest of the content will load after a "loading" dialog.

It is better to have fewer files, as scripts block and load sequentially, or to use deferred loading. There is normally a per-domain limit for parallel downloads, although I cannot for the life of me remember what it is.
For production, if you compile all of the scripts into a single payload and all of the stylesheets into another, you will likely reap some performance benefits. I would also consider minifying the output. YUI Compressor (from Yahoo) and Google Closure Compiler are two tools that can be used to achieve this.
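The joining step itself is small. A bare-bones sketch, assuming the sources have already been read into strings and that the function name is just a placeholder:

```javascript
// Join script sources into one payload for deployment.
// The ';' guards against a file that does not end with a semicolon
// running into the start of the next one.
function concatScripts(sources) {
  return sources.map((src) => src.trim()).join(';\n');
}
```

The combined string would then be passed to a minifier such as the tools named above and written out as a single file.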
This will tell you more about the techniques to stop blocking...
http://www.stevesouders.com/blog/2009/04/27/loading-scripts-without-blocking/
Some performance tips, not limited to JavaScript...
http://developer.yahoo.com/performance/rules.html
