The question is pretty self-explanatory. Why shouldn't I strip whitespace from my HTML? It seems to me that most of the whitespace is used purely for formatting in the text editor and has no impact on the final page.
What's more, when these random nodes of whitespace do have an impact on the final page, it is usually an impact I do not want, such as a mysterious one-character (after whitespace collapse) gap between inline-blocks.
I can strip all these whitespace text nodes pretty easily. Is there any reason I shouldn't?
edit:
It's mainly because of the strange behaviour whitespace causes, rather than for performance. One example is me wanting to put images side by side using inline-block instead of float, while preventing wrapping to the next line and allowing them to spill out of the parent.
The whitespace causes these mysterious gaps, which can be removed by basically minifying the HTML source code to remove the whitespace between inline-blocks manually (and completely messing up your source code formatting in the process).
There's no reason not to, really. It can be done very easily with something like htmlcompressor.
However, assuming you're delivering all your html, css, and js files via gzip, then the amount of real-world bandwidth savings you'll see from stripping whitespace will be very small. The question then becomes, is it worth the trouble?
UPDATE:
Perhaps this will affect your decision. I performed a simple minification on a page of my website just to see what kind of difference it would make. Here are the results:
BEFORE minification
22232 bytes (uncompressed)
5276 bytes (gzip)
AFTER minification
19207 bytes (uncompressed)
5146 bytes (gzip) - 130 bytes saved
The uncompressed file is about 3 KB smaller after minification. But that's not really what matters. The gzip compressed file is what is sent over the wire. And you can clearly see that gzip does a pretty good job even with the non-minified HTML.
I see the benefit of minifying js libraries, or things that aren't changing constantly. But I don't think it's worth the trouble doing this to your HTML for a measly 130 bytes.
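A comparison like the one above can be reproduced with a short script. This is only a minimal sketch: naive_minify is a hypothetical stand-in for a real minifier (it simply collapses whitespace runs and is not safe for <pre> content), and the sample page is made up, so the numbers will differ from the ones reported above.

```python
import gzip
import re


def naive_minify(html: str) -> str:
    """Collapse every run of whitespace to a single space (hypothetical minifier)."""
    return re.sub(r"\s+", " ", html).strip()


def size_report(html: str) -> dict:
    """Return raw and gzip byte counts for the original and the minified markup."""
    minified = naive_minify(html)
    raw, small = html.encode("utf-8"), minified.encode("utf-8")
    return {
        "raw": len(raw),
        "raw_gzip": len(gzip.compress(raw)),
        "min": len(small),
        "min_gzip": len(gzip.compress(small)),
    }


# Made-up sample page: an indented list item repeated many times.
sample = "<ul>\n" + "".join(
    f'    <li>\n        <a href="/item/{i}">Item {i}</a>\n    </li>\n'
    for i in range(200)
) + "</ul>\n"

report = size_report(sample)
print(report)
print("uncompressed saving:", report["raw"] - report["min"])
print("gzip saving:", report["raw_gzip"] - report["min_gzip"])
```

Running it against your own pages (read the file into a string and pass it to size_report) shows whether the gzip saving is worth the extra build step.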
Let me give one reason why you shouldn't minify html:
How HTML eventually gets rendered is strongly tied to the CSS applied to it, but minifiers usually work without taking the CSS into account. Every minifier available at the time of writing removes spaces from HTML based on certain assumptions about your coding and CSS styling. If you don't code the way they expect, the minified page will render differently in the browser than it did before minification.
For example, some minifiers assume that the space between "block elements" (such as <div/>, <p/>) can be removed. This is usually true, because spaces between them have no effect on the rendered result. But what if your CSS sets "display: inline" or "inline-block" on elements whose default display property is block?
Will the HTML snippet below still render as it should if you remove the spaces between the <div/>s?
<div style="display: inline">will</div> <div style="display: inline">this</div> <div style="display: inline">still</div> <div style="display: inline">work?</div>
You may argue that we can keep at least one space and remove the remaining consecutive spaces, and that this still saves a lot of bytes. Then how about the <pre> tag and white-space: pre?
Try copying the HTML snippet from the URL below and pasting it into your minifier, then see whether it renders the same as before minification:
https://jsfiddle.net/normanzb/58rpazL2/
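The inline <div/> case can be checked without a browser by comparing the text nodes before and after stripping. This is a rough sketch: strip_between_tags is a hypothetical, deliberately aggressive minifier written for illustration, not the behaviour of any particular tool, and concatenating text nodes only approximates what inline rendering shows.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html: str) -> str:
    """Concatenate all text nodes, which is roughly what inline elements display."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return "".join(collector.chunks)


def strip_between_tags(html: str) -> str:
    """Hypothetical aggressive minifier: delete whitespace-only gaps between tags."""
    return re.sub(r">\s+<", "><", html)


snippet = (
    '<div style="display: inline">will</div> '
    '<div style="display: inline">this</div> '
    '<div style="display: inline">still</div> '
    '<div style="display: inline">work?</div>'
)

print(visible_text(snippet))                      # will this still work?
print(visible_text(strip_between_tags(snippet)))  # willthisstillwork?
```

The whitespace-only text nodes are the only thing separating the words, so removing them changes what the reader sees.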
The only downside of stripping whitespace from production pages is reduced readability and maintainability for whoever edits those pages after you. If you keep a properly formatted, readable version for editing and minify it after editing to produce the production pages, it doesn't really cause significant problems.
I'm not sure how effective, or useful, the technique will be, but there's nothing to stop you trying it.
Short answer: no reason whatsoever
The only real purpose white space serves is to make the code more human-readable. You can, over time, save a lot of bandwidth by stripping all the unnecessary white space out of your documents, and it should be considered good practice for production code. If you're compressing your content the saving will be less, but even 1% of 1GB is 10MB... If you're doing 100GB in a month on a busy web site, cutting out 1% of the data might be the difference between two pricing tiers of hosting...
As you say, some browsers (usually IE, grrrr....) will occasionally interpret the white space when they render the page, but usually when this happens it's in a way you'd rather it hadn't...
Related
I have a problem. I am creating a website and it is working well. But my boss told me that I shouldn't indent my code because it can affect memory usage, since every space or indentation in the code gets read. Is this true? If so, can you provide a reference so that I can defend myself? My boss showed me some websites in Japan that don't indent their code and asked me whether indentation is a programming standard and, if so, why some websites in Japan don't indent their code. It is all aligned left. I have forgotten some of the websites he showed me because they are in Japanese.
That's all thanks.
Here's the website that has no indentation
http://www.dior.com/couture/home/ja_jp
For compiled code, it makes no difference whatsoever unless whitespace is significant in the language and means something that impacts memory usage.
Code transmitted over a network (HTML, CSS, JS, XML, etc.) can be made marginally smaller by removing spaces (in addition to compressing the output, which is considered a best practice). But this should never impact coding style and readability!
Whitespace removal can be done automatically when the page is served. This will actually increase the load on the server slightly (CPU, possibly memory), not reduce it. If the bandwidth savings are worthwhile (doubtful, if compression is enabled), it is an acceptable trade.
This answer shows the savings (or lack thereof) achieved by removing whitespace.
But my boss told me that I shouldn't indent my codes because it can
affect memory space because it reads the spaces or every indention in
the code?
"It" could refer to the web server sending the content or the browser reading the content.
Server: must send bytes of data from disk/memory in response to a request. So yes, extra whitespace may take a trivial amount more memory if the data is buffered and/or cached.
Browser: the browser can discard whitespace it doesn't need. It may never use any memory at all for those extra bytes of whitespace.
So (to be generous) your boss is right, but he is focused on the wrong thing. The savings, if any, are measured in bytes and nanoseconds.
There are many, many other things that can be optimized first and will yield much more substantial benefits.
Hurting developer productivity is expensive. People indent code to make it more readable, and lack of readability equals lost productivity.
Running wget some_url, I found that the downloaded document has many white spaces and blank lines, like
<span class="meta">
someValue
</span>
The whole document downloaded by wget is nicely laid out, just as we see it in Chrome dev tools. Does the document really contain all these white spaces and blank lines (or tabs), and are they downloaded along with the main content?
e.g.
If the document (again, downloaded by wget or curl) is:
<div class="   someclass">

somevalue

  </div>
there are 5 spaces (3 before someclass, 2 before </div>) and 2 blank lines wrapping somevalue.
Was it downloaded in a tightened form like:
<div class="someclass">somevalue</div>
If not, I'm shocked that so much bandwidth is wasted on this mostly useless information. Is it just wasted (unless it's there for layout purposes)?
It is my understanding that whitespace takes up just as much space as any other character, so technically yes, it's "a waste". However, generally speaking, it is something you would never notice, as there are many other things that hinder load time. If you were loading an incredibly large page with a high percentage of whitespace on an incredibly slow network, you might be able to notice.
Generally it is better to think not about how it affects performance (because in practice it barely does) and more about whether it makes your code readable. When writing something you need to revisit or show to others, whitespace is very important. When obscuring code so people won't mess with it, getting rid of whitespace can go a long way.
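The client can tell the web server which compression algorithms it accepts with the Accept-Encoding request header, and the server reports the one it actually used in the Content-Encoding response header. For example, gzip: http://betterexplained.com/articles/how-to-optimize-your-site-with-gzip-compression/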
However, the webserver doesn't have to do it. It's like you're strongly hinting for the webserver to compress your traffic.
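The "hint" nature of this exchange can be sketched in a few lines. This is a simplified illustration, not a real server: respond and its server_supports_gzip parameter are hypothetical names, and the parsing ignores quality values such as "gzip;q=0" that a real implementation would have to honour.

```python
import gzip


def respond(body: bytes, accept_encoding: str, server_supports_gzip: bool = True):
    """Hypothetical helper: compress only if the client allows gzip AND the server chooses to."""
    # Simplified parsing: take each comma-separated token and drop any ";q=..." suffix.
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    if server_supports_gzip and "gzip" in offered:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    # The server may always fall back to sending the body uncompressed.
    return {}, body


page = b"<div>\n    somevalue\n</div>\n" * 100

headers, payload = respond(page, "gzip, deflate")
print(headers, len(page), len(payload))

headers, payload = respond(page, "gzip, deflate", server_supports_gzip=False)
print(headers, len(payload))
```

In the second call the client still asks for gzip, but the server sends the page as-is, which is a valid response.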
I have a big performance issue in IE6, even with JavaScript turned off.
It's strange because sometimes when the page is loaded, the header is floated next to the footer, or the slideshow is over the content.
I wonder if someone has had the same or similar issues in IE6, and whether minifying the generated source code into a single line will help load speed in that browser?
I want to mention that it has to be compatible with IE6, so please do not post messages like "use modern browsers".
The problem was the MS PNG fixer inside the CSS. Even with JavaScript turned off it still ran, so when I removed the CSS lines with the MS filter for PNG transparency, the page started working as it should.
Thanks for all the responses.
I doubt that removing newlines would increase the speed in any noticeable fashion.
That is, the performance issues are likely not caused by line count but rather the size/number/type/cost of the elements/operations after the parsing.
The actual lexer that handles the newlines should see them no differently in the stream than any other character. Depending on exactly what "source" means in this context, newlines have some semantic effect at the parser:
CSS: none (ignoring embedded newlines)
HTML: possible new text-nodes or different content
JavaScript: automatic semicolon insertion (or embedded newlines)
However, there is no reason not to try a minified version quickly to see if it makes a difference and, more importantly, to satisfy your curiosity ;-) I would definitely heed the other suggestions as well, such as checking the page (everything) for validity.
Happy coding.
You haven't specified what your page consists of, but my guess would be you're outputting the mother of all HTML tables?
The reason I guess that is because IE6 is known to be extremely slow when rendering large tables, particularly where the column widths are variable. (later IEs are also slow with this, but not as bad as IE6)
The reason is that the browser attempts to render the entire table before displaying anything, and performs a lot of calculations to work out how to render it.
The answers on this question may also help you: Are large html tables slow?
Sometimes when I look at the style sheets of big websites (even this one), the CSS code is completely stripped of formatting (or whatever you call it), like this: http://cdn.sstatic.net/stackoverflow/all.css
Is this just the result of the style sheet being generated by a CMS?
I call it "minified", and I think that's the general term. But the reason is to reduce loading times. All those useless spaces and comments still count as bytes, and sometimes you can have more spaces and comments than actual effective characters! (It also obfuscates the stylesheets, although that's really pointless as spaces can easily be restored with whatever formatting you need.)
It's probably generated on the fly from a more scriptable/dynamic/DRY layout language, and there is simply no reason to add whitespace since no one should be reading the output, and it would only add to the file size.
It can be generated by a CMS or manually. Removing all the tabs and spaces reduces the size of the file, so it loads faster, which in turn can make the site faster.
I started wondering what the overall impact is of using spaces to indent HTML documents.
Why not simply use tabs to indent? Wouldn't this be more cost-effective: 1 char (\t) vs. example 4 chars (spaces)?
I did a little experimenting by converting an ASP.NET page to use tabs and comparing the sizes of the rendered markup.
Replacing the white space in just one partial view reduced a 22 KB page to 19.4 KB, a 12% reduction. After changing all indentation, the page ended up at 16.7 KB, a 24% reduction! (I used Chrome dev tools and Fiddler to verify.)
Is my reasoning sound? Should tabs be used primarily for indentation of HTML? Is there any reason to use spaces (such as compatibility with exotic browsers)?
PS: Stack Overflow seems to use spaces too. Converting the SO main page to use tabs gave a 9% reduction. Is this a valid observation? If so, why haven't they used tabs?
StackOverflow uses HTTP Compression - when this is turned on, the difference between using spaces and tabs goes down - a lot.
You need to run your tests against the compressed versions for reliable results.
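One way to run such a test locally is to build the same markup twice and compare both the raw and the gzip sizes. This is only a sketch: indented_page generates a made-up, highly repetitive table, so real pages will compress less well and the exact numbers will differ.

```python
import gzip


def indented_page(indent: str, rows: int = 300) -> bytes:
    """Build a hypothetical nested table where each nesting level uses `indent` once."""
    lines = ["<table>"]
    for i in range(rows):
        lines.append(f"{indent}<tr>")
        lines.append(f"{indent * 2}<td>")
        lines.append(f"{indent * 3}<span>row {i}</span>")
        lines.append(f"{indent * 2}</td>")
        lines.append(f"{indent}</tr>")
    lines.append("</table>")
    return "\n".join(lines).encode("utf-8")


spaces = indented_page("    ")  # four spaces per level
tabs = indented_page("\t")      # one tab per level

raw_saving = len(spaces) - len(tabs)
gzip_saving = len(gzip.compress(spaces)) - len(gzip.compress(tabs))

print("raw bytes saved by tabs: ", raw_saving)
print("gzip bytes saved by tabs:", gzip_saving)
```

The raw saving is three bytes per indentation level per line, while the compressed saving is typically a small fraction of that, because runs of identical characters compress very well either way.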
You do have a point though for the cases when a browser does not support the compression schemes the server supports.
First thing: HTML has no rule about indentation. Programmers do it for code readability and to show the program's structure. Moreover, we can reduce the size taken up by indents and white space through compression.
Minifying/compacting/compressing HTML: compacting HTML code can save many bytes of data and speed up downloading, parsing, and execution time.
StackOverflow uses HTTP Compression
Minifying HTML has the same benefits as those for minifying CSS and JS: reducing network latency, enhancing compression, and faster browser loading and execution. Moreover, HTML frequently contains inline JS code (in <script> tags) and inline CSS (in <style> tags), so it is useful to minify these as well.
Note: This rule is experimental and is currently focused on size reduction rather than strict HTML well-formedness. Future versions of the rule will also take into account correctness. For details on the current behavior, see the Page Speed wiki.
Tip: When you run Page Speed against a page referencing HTML files, it automatically runs the Page Speed HTML compactor (which will in turn apply JSMin and cssmin.js to any inline JavaScript and CSS) on the files and saves the minified output to a configurable directory.
Refer : http://code.google.com/speed/page-speed/docs/payload.html#MinifyHTML
Why not simply use tabs to indent? Wouldn't this be more cost-effective: 1 char (\t) vs. example 4 chars (spaces)?
If you're worried about downloaded HTML size, you won't fuss over tabs-vs-spaces — you'll compress your HTML as it goes over the wire and minify your markup, CSS, and Javascript, which provide real savings and don't interfere with your own coding guidelines.