I'm thinking about using the jQuery Ajax load method. In some cases, the HTML I want to load is quite large. I'm wondering if the browser already streamlines the process behind the scenes, or should I minify and/or compress the HTML before calling .load() from jQuery? If so, which one, or both? Is there a standard way to perform minification and/or compression in this scenario?
UPDATE
Does this make any sense:
The data I'm going to retrieve from the server is static. Let's say I have data for apples, oranges, kumquats, and papayas, and none of it changes "on the fly" (only when I update the site).
So is it preferable that I get the data as JSON via jQuery this way:
$.getJSON('kumquats')
(...and then, of course, process the results that come back)... OR ... simply send back the HTML with no need of massaging, as "kumquats" will always send back the exact same HTML, "oranges" will always be the same HTML, etc.
In the latter option, then, I would do something like this (jQuery pseudocode) instead:
$('#MainContent").html($.load("\Content\Kumquat.htm"));
In summation, I can send all the HTML fully-formed across the wire, and clog up the pipes with some extra bits for a bit, OR I can send a less verbose representation of the data (JSON), and then massage it in the $.getJSON() callback function, transforming it into HTML. Performance-wise, does it make much difference? BTW, this is not "sensitive" data - it doesn't matter who sees it as it zips by through the ether.
I'm wondering if the browser already streamlines the process behind the scenes
The browser can't control how much data the server sends in its response.
or should I minify and/or compress the html before calling .load() from jQuery?
You call load on the client. The server has to do any minification or compression of the HTML.
Is there a standard way to perform minification and/or compressing in this scenario?
Compression is usually handled by gzip encoding. How you set that up depends on your server and/or the server side programming language that is generating the content.
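For example, if your backend happened to be Node with Express, a minimal sketch of enabling gzip (the compression middleware is a separate npm package, and the file path here is hypothetical) would look like this:

var express = require('express');
var compression = require('compression');
var app = express();
app.use(compression()); // gzip-encodes responses when the client sends Accept-Encoding: gzip
app.get('/Content/Kumquat.htm', function (req, res) {
    res.sendFile(__dirname + '/Content/Kumquat.htm');
});
app.listen(3000);

On Apache, mod_deflate achieves the same thing with a few lines of configuration.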
I'm not aware of any standard way to perform minification. I used HTML Tidy to do that once.
The browser can't minify HTML before downloading it; the whole point of minification is to reduce download time by shrinking the file size, so minifying on the client would be counterintuitive.
Your server needs to minify and/or compress. It is probably already compressing by default (mod_deflate on Apache, for example). Minification of the HTML can be done in a variety of ways depending upon the server-side technology you are using. There may be a library for it, or you could use a third-party CDN to minify and serve the content for you.
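As a hedged illustration, the html-minifier package on npm can do the minification at build time, so you serve an already-minified copy (file paths here are hypothetical):

var fs = require('fs');
var minify = require('html-minifier').minify;
var html = fs.readFileSync('Content/Kumquat.htm', 'utf8');
var out = minify(html, { collapseWhitespace: true, removeComments: true });
fs.writeFileSync('public/Content/Kumquat.htm', out); // serve this minified copy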
Related
I would like to cache semi-dynamic HTML templates. They will probably change about once a week.
Currently I'm seeing two options:
1. Generate an ETag for the HTML.
Pros: Requires little additional scripting.
Cons: Requires an HTTP call for every resource. (The website can have ~10-20 calls per session.)
2. Use a version parameter when requesting the HTML, e.g. http://example.com/header.html?v=5.
Pros: You can set the cache timeout with HTTP headers, so no HTTP call would be needed at all. Probably faster loading time.
Cons: Not as flexible. If the expiry time is too large it could cause conflicts in the future.
I'm currently thinking about using the second option. What would be the best option and why?
Background:
I'm using a CMS to dynamically generate HTML templates that are used by UI-Router (Angular) to combine into a full application. I would like to cache the HTML templates client-side so the client would only need to update its files if the content has changed.
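A minimal sketch of the second option (the version constant, state name, and the Node/Express server bits are all hypothetical): the client appends a version to every template URL, and the server sends far-future caching headers, so templates are only re-fetched when you bump the version.

// client side, inside your Angular config block
var TEMPLATE_VERSION = 5; // bump this when the CMS content changes
$stateProvider.state('home', {
    url: '/',
    templateUrl: '/templates/header.html?v=' + TEMPLATE_VERSION
});

// server side (Express sketch): let the browser cache templates for up to a year
app.use('/templates', express.static('templates', { maxAge: '365d' }));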
Some stats before I state the situation:
total JS code = 122 MB
minified = 36 MB
minified and gzip = 4 MB
I would like to get the entire 4 MB down in one shot (with a loading progress indicator on the page), decompress the files, but not parse them yet. We don't want the code expanding in the browser's memory when a lot of it might not be required at that point. The parsing should happen when a script tag with the corresponding JS file name is encountered.
Intention: a faster one-shot download of the JS files, while keeping the behaviour unchanged from the browser's perspective.
Do any such solutions exist? Am I even thinking sane?
If yes, I know how to get the gzip; I would like to know how to keep the files in the browser cache so that when a script tag is encountered the browser doesn't fire an XMLHttpRequest for it again.
The trick is to leverage HTTP caching directives. For a starter, take a look at this. You should only need to fetch your JS code once, because you can safely set the cache directives to instruct the browser to hold on to the JS file indefinitely (subject to space). Indefinitely in this context typically means the year 2035.
When you're ready to update all your browser-side caches with a new version of the JS file, simply use a cache-busting query string. Any serial number or date and time will do, or a simple version number, e.g.:
<script src="/js/myfile.js?v2.1"></script>
Some minification frameworks handle the cache-busting for you. A good technique, for example, is to MD5 the file's contents and use that as the cache-buster query string. That way, whenever your source JS changes the browser will request the new version (because the query string is embedded in your HTML script tag) and then cache it for as long as possible again.
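A minimal Node sketch of that idea (file paths are hypothetical):

var fs = require('fs');
var crypto = require('crypto');
var source = fs.readFileSync('js/myfile.js');
var hash = crypto.createHash('md5').update(source).digest('hex');
// embed the hash in the page, so the URL only changes when the content does
var tag = '<script src="/js/myfile.js?v=' + hash + '"></script>';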
XMLHttpRequest will honour the caching primitives you set.
In the other part of your question, I believe what you're asking is whether you can download one combined script file and then only refer to parts of it with individual script tags on the page. No - I don't believe you can do that. If you want to refer to individual files you would need an HTTP URL and caching directives for each piece of gzipped content you want to use separately. However, you might find this is as performant, or maybe even more performant, than one big file, depending on how much parallelisation you can achieve.
A neat trick here is to pre-load a lot of what you need. Google have been doing this on the home page for years. Basically, they pre-load stacks of resources (images certainly, but possibly also JS). So whilst you're thinking about what search query to enter, they are already loading the cache up with stuff you'll want on the subsequent page.
So you could use XMLHttpRequest to fetch your JS files (without parsing them) well before you need them. Then by the time your <script/> tag refers to them they'll already be downloaded and you just need to parse them.
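A minimal sketch of that pre-loading trick (the file names are hypothetical); it assumes your cache headers allow the later script tag to be served from cache:

['/js/big-module.js', '/js/other-module.js'].forEach(function (url) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.send(); // downloads and caches the file, but nothing is parsed or executed
});

// later, when the code is actually needed:
var s = document.createElement('script');
s.src = '/js/big-module.js'; // same URL, so it should come straight from cache
document.head.appendChild(s);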
In addition to cirrus's point about using HTTP caching, you could break that still-pretty-large 4 MB file down and only load each piece when its functionality is required.
It's more HTTP requests, but 4 MB is a big hit in one go.
Suggest something like require.js to load in the appropriate files when they are needed:
http://requirejs.org/docs/start.html
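A minimal RequireJS sketch (the module names are hypothetical); nothing is downloaded or parsed until this code path actually runs:

require.config({ baseUrl: '/js' });
require(['reports/charting'], function (charting) {
    charting.render(document.getElementById('report'));
});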
I have a large HTML file being generated for a report at the moment (around 2-3 MB) and this file is going to be transferred a lot of times. It is not being accessed through any form of web host; it is just a file being accessed over a network, but the network spans the world and is therefore not fast everywhere.
I know about gzip compression, but from the looks of it that will only work with an Apache web server or something similar, configured via the .htaccess file. I have already stripped the whitespace from the HTML file. My question is: besides just zipping it up in a standard archive, what else can I do to minimize the size of the file?
Thanks, and I will be happy to answer any other questions.
You can certainly look at the HTML structure itself to see if you can reduce the number of tags. For example, do you have a bunch of nested table structures that could be replaced? Do you have inline styles that could be put into a separate stylesheet? Do you have any JavaScript content which could be put into a separate file?
I don't think you can compress it without a proper web server, because it is the web server that tells the browser, in the HTTP response, that the file needs to be unzipped.
If markup makes up the greater part of the file (i.e. there are more tags and scripts than text), you can use CSS to minimize the size.
If the data is the greater part (more information than tags), I suggest you use a web server (Microsoft IIS can also compress it).
But, if possible, also consider splitting the data into several files, for example with different levels of detail.
It is possible to embed compressed data within the HTML file and use JavaScript to dynamically decompress the data as the page is rendered, using a JavaScript implementation of gzip decompression. See this answer for references: JavaScript implementation of Gzip
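A minimal sketch of that approach, assuming the pako library is loaded for decompression and the payload was gzipped then base64-encoded into the page (the element IDs are hypothetical):

var b64 = document.getElementById('payload').textContent; // base64-encoded gzip data
var bytes = Uint8Array.from(atob(b64), function (c) { return c.charCodeAt(0); });
var html = pako.ungzip(bytes, { to: 'string' }); // gunzip back to an HTML string
document.getElementById('report').innerHTML = html;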
Is it possible to reuse HTML tags across multiple files, headers and footers for example? Placing them in separate files adds an extra HTTP request, that I'd like to avoid.
I don't want to replicate minor changes in headers and footers across every html file every time a change request comes along.
HTML is not a programming language - it's a markup language. You don't do object-oriented HTML because it isn't object-based. This is the whole purpose of a server-side language: so you can write include files and use them in your server-side application.
If you have Apache, however, you can use server-side includes, which don't require a programming language such as PHP, but are less flexible:
<!--#include virtual="/footer.html" -->
First, HTML isn't even a programming language, so it's impossible to have "Object-oriented" HTML.
Placing them in separate files adds an extra HTTP request, that I'd like to avoid.
If this is the reason for your "without server side code" requirement, then you are mistaken - the client does not fetch the templates that make up a page separately; the server side code will return a single HTML page to the client.
If, on the other hand, you don't have the option to run any server-side code at all and have to make do with static HTML pages, then there are only two options I can think of: iframes (which do result in separate HTTP requests, of course), or some sort of tool that basically runs the equivalent of server-side code to embed your reused templates everywhere and spits out the result to be uploaded to the server. You can achieve this effect by running a PHP/Apache-with-SSI/JSP/whatever server on your development machine and using wget to make a static snapshot of the pages.
What I want to do is this. The files can be scattered during development. But when I'm ready to release, a toolkit should compile the included files into a single HTML file.
You can use a template language/engine, such as jinja2.
You can layout files in a certain hierarchy, and have templates inherit from other templates, and include other templates, and define reusable macros (closest thing to what you referred to as "reusable tags").
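A minimal Jinja2 sketch (the file names are hypothetical): a base.html pulls in the shared pieces,

<html>
  <body>
    {% include "header.html" %}
    {% block content %}{% endblock %}
    {% include "footer.html" %}
  </body>
</html>

and each page then only supplies its own content:

{% extends "base.html" %}
{% block content %}<p>Page-specific content goes here.</p>{% endblock %}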
What I want to do is this. The files can be scattered during development. But when I'm ready to release, a toolkit should compile the included files into a single HTML file.
I know this is late, but CodeKit's .kit language lets you do exactly what you were saying.
http://incident57.com/codekit/help.php
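For example, a page sharing a header and footer would look something like this, if I have the Kit import syntax right (file names are hypothetical):

<!-- @import "header.kit" -->
<p>Page-specific content goes here.</p>
<!-- @import "footer.kit" -->

CodeKit then compiles each .kit file into a plain .html file with the imports inlined, which is exactly the "compile on release" step you described.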
I think the language you've chosen in your question (object oriented HTML) is actually masking the real issue you have here...
What I want to do is this. The files can be scattered during development. But when I'm ready to release, a toolkit should compile the included files into a single HTML file.
This sounds like a job for a preprocessor. I don't believe it has anything to do with your web server or server-side technology, as this is a step which would happen before deployment.
There are a number of text preprocessors available, e.g. M4; hell, you could even use the C compiler's preprocessor if you wanted. A quick Google reveals that there are specialised preprocessors for HTML as well...
Automatic file inclusion, automatic escaping, and whatnot that can be done with automatically inserted headers and footers, chosen based on path patterns.
Seems to fit the bill?
Sure. But these would have to be separate Ajax calls from the client. There are a lot of JavaScript MVC frameworks that do that.
If you want to have include files during development, then compile them into free-standing HTML files, you could do that by spidering your development server with wget: whatever server-side technology you use will combine the files and return the HTML, which wget will save as one file.
Everything is an object somewhere in the technology stack, even if not directly; you interact indirectly with objects that are created at different levels, as per the security implementation.
You can do this.
I just released a mature framework called Hypertag that is, in fact, Object Oriented HTML. It is entirely client-side, in continuous development, and allows for very interesting, yet HTML-compatible, advanced solutions for logic and layout.
See http://hypertag.io for more.
I've got a fairly Ajax-heavy site, and some 3k HTML-formatted pages are inserted into the DOM from Ajax requests.
What I have been doing is taking the HTML responses and just inserting the whole thing using jQuery.
My other option is to output XML (or possibly JSON) and then parse the document and insert it into the page.
I've noticed it seems that most larger sites do things the JSON/XML way. Google Mail returns XML rather than formatted HTML.
Is this due to performance, or is there another reason to use XML/JSON versus just retrieving HTML?
From a JavaScript standpoint, it would seem injecting HTML directly is simplest. In jQuery I just do this:
jQuery.ajax({
    type: "POST",
    url: "getpage.php",
    data: requestData,
    success: function(response) {
        jQuery('div#putItHear').html(response);
    }
});
With an XML/JSON response I would have to do something like:
jQuery.ajax({
    type: "POST",
    url: "getpage.php",
    data: requestData,
    success: function(xml) {
        $("message", xml).each(function(id) {
            var message = $("message", xml).get(id);
            $("#messagewindow").prepend("<b>" + $("author", message).text() +
                "</b>: " + $("text", message).text() +
                "<br />");
        });
    }
});
This is clearly not as efficient from a code standpoint, and I can't imagine it gives better browser performance, so why do things the second way?
Returning JSON/XML gives the application more freedom compared to returning HTML, and requires less specific knowledge in different fields (data vs markup).
Since the data is still just data, you leave the choice of how to display it to the client side of things. This allows a lot of the code to be executed on the client side instead of on the server: the server side needs to know only about data structures and nothing about markup. All the server programmer needs to know is how to deliver data structures.
The client implementation only needs to know how to display the data structures returned by the server, and doesn't need to worry about how these structures actually get built. All the client programmer needs to know is how to display data structures.
If another client is to be built (one that doesn't use HTML as a markup language), all the server components can be reused. The same goes for building another server implementation.
It will normally reduce the amount of data transferred and therefore improve transfer speed. As the over-the-wire leg is normally the bottleneck in a process, reducing the transfer time will reduce the total time taken to perform the process, improving the user experience.
Here are a few pros for sending JSON/XML instead of HTML:
If the data is ever going to be used outside of your application, HTML might be harder to parse and fit into other structures
JSON can be directly embedded in script tags, which allows cross-domain AJAX scenarios
JSON/XML preserves the separation of concerns between the server-side scripts and views
Reduces bandwidth
You should check out Pure, a templating tool to generate HTML from JSON data.
Generally JSON is a more efficient way to retrieve data via Ajax, as the same data in XML is a lot larger. JSON is also more easily consumed by your client-side JavaScript. However, if you're retrieving pure HTML content I would likely do as you suggest. Although, if you really needed to, you could embed your HTML content within a JSON string and get the best of both worlds.
I'm currently wrestling with this decision too and it didn't quite click until I saw how Darin boiled it down:
"If the data is going to ever be used outside of your application HTML might be harder to parse and fit into other structure"
I think a lot of it is where/how the data is going. If it's a one-off application that doesn't need to share/send data anywhere else, then spitting back pure HTML is fine, even if it does weigh more.
Personally, if there is complex HTML to be wrapped around the data, I just spit back the HTML and drop it in. jQuery is sweet and all, but building HTML with JavaScript is often a pain. But it's a balancing act.
In some cases, AJAX responses need to return more information than just the HTML to be displayed. For example, let's say you are returning a list of the first twenty items from a search. You may need to return the total number of search results to be displayed somewhere else in the DOM. You could try piggybacking the total count in a hidden div, but that can get messy. With JSON, the total count can simply be a field in a structured JSON response.
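A minimal sketch of that pattern (the URL and field names are hypothetical): the server returns a small JSON envelope carrying both the rendered rows and the metadata.

jQuery.getJSON('search.php', { q: query, page: 1 }, function (response) {
    jQuery('#results').html(response.html);      // pre-rendered result rows
    jQuery('#totalCount').text(response.total);  // extra data, no hidden divs needed
});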
To me it boils down to this:
For many of us, it's much less work to use a mature server-side template engine that we're accustomed to in order to generate HTML and send it down the pipe, than to use a bunch of JavaScript code to generate the HTML client-side. Yes, there are now some templating engines for JavaScript, which may mitigate this somewhat.
Since I already separate model, logic and views server-side, there is no argument against having yet another view. JSON is a view; HTML is another view.
And let's face it: both HTML/AJAX and JSON/AJAX are many times better than a full page over the pipe.
The final thing you perhaps need to think about is: if you're going to be search-engine friendly, you might have to generate the HTML server-side anyway (the old "degrade gracefully" mantra).
I usually do a combination. If there is client-side logic, I use JSON; otherwise I use HTML. Notifications and autocomplete fields are sent via JSON.