Why does Amazon have so much white space in its HTML?

If you view the source of the home page, there are somewhere around 14 newlines before the DOCTYPE. For logged-in pages in Seller Central, I'm seeing more than 400! In between components of the page, there will often be 20 newlines of padding, and sometimes much more.
Amazon seems to be fanatical about speed, so I'm incredulous that they'd be doing this unintentionally, especially before the DOCTYPE. (The rest could just be a nicely formatted for loop that elects not to render some template on a given iteration, maybe?)
Could they be initiating a connection before the logic of the application is ready to start spitting out code, and "streams" some whitespace?

I recommend that you contact Amazon regarding your query, as only they can provide a valid answer to your question. Below is the link to their 'Contact Us' page:
https://www.amazon.com/gp/help/contact-us/general-questions.html?skip=true
Hope this helps.

My guess is that they are trying to fake a short "first byte" response time.
Many web speed tests (e.g. http://www.webpagetest.org) score a fast first byte positively, and search engines probably value this metric as well, since it is taken as a sign of a small "processing time".
Printing some whitespace while the content of the page is being computed gets something to the browser fast and can reduce the "first byte" time from, say, 0.5 seconds to nearly 0.
Or maybe it is just a bug :)
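
If it is deliberate, the mechanics are simple: flush some harmless bytes as soon as the request arrives, then do the slow work. Here is a minimal sketch of that early-flush idea in Python/Flask; the route, padding size, and sleep are illustrative assumptions, not Amazon's actual setup:

import time
from flask import Flask, Response, stream_with_context

app = Flask(__name__)

@app.route("/")
def page():
    def generate():
        # Send padding immediately so the browser's "time to first
        # byte" clock stops almost as soon as the request arrives.
        yield "\n" * 400
        # Meanwhile, do the slow work (DB queries, templating, ...).
        time.sleep(0.5)  # stand-in for real processing
        yield "<!DOCTYPE html><html><body>Hello</body></html>"
    return Response(stream_with_context(generate()), mimetype="text/html")

Browsers ignore whitespace before the DOCTYPE, so the padding is invisible to the user but counts toward first-byte measurements.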

Related

Does indentation in code affect how a website's code is read?

I have a problem. I am creating a website and it is working well. But my boss told me that I shouldn't indent my code because it can affect memory usage, since the spaces and every indentation in the code get read. Is this true? If yes, can you provide a reference for that so that I can defend myself? My boss showed me some websites in Japan that don't indent their code and asked me whether indentation is a programming standard; if it is, why do some websites in Japan not indent their code? It is all aligned left. I forgot some of the websites he showed me because they are in Japanese.
That's all, thanks.
Here's a website that has no indentation:
http://www.dior.com/couture/home/ja_jp
For compiled code, it makes no difference whatsoever unless whitespace is significant in the language and means something that impacts memory usage.
Code transmitted over a network (HTML, CSS, JS, XML, etc.) can be made marginally smaller by removing spaces (in addition to compressing the output, which is considered a best practice). But this should never impact coding style and readability!
Whitespace removal can be done automatically when the page is served. This will actually increase the load on the server slightly (CPU, possibly memory), not reduce it. If the bandwidth savings are worthwhile (doubtful, if compression is enabled), it is an acceptable trade.
This answer shows the savings (or lack thereof) achieved by removing whitespace.
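
For a rough picture of what serve-time removal involves, here is a naive sketch in Python; real minifiers (htmlmin, for example) must special-case <pre>, <textarea>, and inline scripts, which this regex deliberately ignores:

import re

def strip_intertag_whitespace(html: str) -> str:
    # Collapse runs of whitespace between tags; leaves text content alone.
    return re.sub(r">\s+<", "><", html).strip()

page = '''
<div class="box">
    <span>hello</span>
</div>
'''
print(strip_intertag_whitespace(page))
# <div class="box"><span>hello</span></div>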
But my boss told me that I shouldn't indent my code because it can affect memory usage, since the spaces and every indentation in the code get read?
"It" could refer to the web server sending the content or the browser reading the content.
Server: must send bytes of data from disk/memory in response to a request. So yes, extra whitespace may take a trivial amount more memory if the data is buffered and/or cached.
Browser: the browser can discard whitespace it doesn't need. It may never use any memory at all for those extra bytes of whitespace.
So (to be generous) your boss is right, but he is focused on the wrong thing. The savings, if any, are measured in bytes and nanoseconds.
There are many, many other things that can be optimized first and will yield much more substantial benefits.
Hurting developer productivity is expensive. People indent code to make it more readable, and lack of readability equals lost productivity.

Do white spaces and blank lines in an HTTP response affect the speed of downloading the page?

Running wget some_url, I found that the downloaded document has many white spaces and blank lines, like:
<span class="meta">
        someValue
</span>
The whole document downloaded by wget has the same nice layout we see in Chrome dev tools. Does the document really contain all these white spaces and blank lines (and tabs), and are they downloaded along with the main content?
For example, if the document (again, downloaded by wget or curl) is:
<div class="   someclass">

somevalue

  </div>
there are 5 spaces (3 before someclass, 2 before </div>) and 2 blank lines wrapping somevalue.
Is it downloaded in a tightened form like:
<div class="someclass">somevalue</div>
If not, I'm shocked by the fact that so much bandwidth is wasted on this mostly useless information. Is it all just waste (except for the layout purpose it serves)?
It is my understanding that a whitespace character takes up just as much space as any other character, so technically yes, it's "a waste". However, generally speaking, it is something you would never notice, as there are many other things that hinder load time. If you were loading an incredibly large page with a high percentage of whitespace on an incredibly slow network, you might be able to notice.
Generally it is better to think not about how it affects performance (because it doesn't) and think more about whether it makes your code readable. When writing something you need to revisit or show to others, whitespace is very important. When obscuring code so people won't mess with it, getting rid of whitespace can go a long way.
You can have the web server compress responses and announce the algorithm with the Content-Encoding header. For example, gzip: http://betterexplained.com/articles/how-to-optimize-your-site-with-gzip-compression/
However, the web server doesn't have to do it; the client's Accept-Encoding header is only a strong hint that it can accept compressed traffic.
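
You can watch this negotiation from the client side; a quick sketch with the Python requests library (the URL is just an example):

import requests

# Advertise gzip support and see what the server actually did.
r = requests.get("https://www.example.com/",
                 headers={"Accept-Encoding": "gzip"})
print(r.headers.get("Content-Encoding"))  # "gzip" if the server complied
print(len(r.content))                     # body size after decompression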

Html to Word long document

I can create an extensive Word document using HTML, including a cover page, header & footer, page numbers, etc.
But my problem is: when my document is too long (like 100 pages or more) and I open the doc with Word 2003:
the document can be loaded and I can see the cover page.
but when I try to scroll down a little bit to examine the report, Word starts a long-lasting process (I don't know what it is) and does not respond.
if the doc is about 60 pages, the process lasts about 5 minutes, and then I can navigate through the document.
I have tried the following:
Disabled Spelling and Grammar check
Disabled auto-save
Is there anyone with a similar experience? I am creating the document with html and a few vml tags embedded in the document. What can be the cause of this unresponsive behavior?
Word is not designed for handling large documents. There are several places where its behavior is not O(n log n) (with n the length of the document). You need to at least disable page numbering.
If you really want to find out, create some test cases:
start with plain text only, nothing fancy at all, generate 100 pages, and see if the problem persists (a generator sketch follows below).
step by step, add features back until the problem surfaces (fastest is to halve the feature difference each time).
it is likely that more than one feature contributes to these performance problems, so you have to be careful when applying this halving (bisection) approach.
And when you know, tell us.
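
For the first test case, a plain generator like this sketch will do (the page count and paragraph density are arbitrary assumptions; adjust for your layout):

# Generate a plain ~100-page HTML file that Word can open: no headers,
# footers, page numbers, or VML. Then add features back one at a time.
PARAGRAPHS_PER_PAGE = 25  # rough guess for Word's default layout
PAGES = 100

with open("test.html", "w", encoding="utf-8") as f:
    f.write("<html><body>\n")
    for i in range(PAGES * PARAGRAPHS_PER_PAGE):
        f.write(f"<p>Paragraph {i}: plain filler text, nothing fancy.</p>\n")
    f.write("</body></html>\n")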

Benefits of omitting closing body and html tags?

Are there any benefits to omitting the closing body and html tags?
Why is the google.com homepage missing the closing body and html tags?
edit - with regards to bandwidth, it's an extremely small amount, really. Say 10 million hits × roughly 10 bytes saved ≈ 100 MB only... Are there reasons other than bandwidth?
edit 2 - and nope, Google (Yahoo, Microsoft, and others) are not W3 validator compliant... when it comes to bandwidth saving en masse vs. W3 compliance, I guess the latter gets sacrificed?
Think of how many times that page is served every day. Even small reductions in size can be significant at that volume of traffic.
(Although I work for Google, please don't treat this as an "official answer from Google" - it's a guess, but an educated one.)
Apart from a gain in bandwidth, there isn't.
Just because they do it doesn't mean you should.
You're thinking of that as a standalone thing. If they've done several different things that save bandwidth, then it all adds up. Each one might not be significant on its own, but if they have 5 or 10 optimisations, then it's significant in total.
Also, those ten bytes may well reduce the data size enough to make it take one less TCP/IP packet, which gives significantly higher savings than simply reducing the size of one packet.
I think JB is on the right track. Their home page is about 8750 bytes (for me), meaning that if they can get about 1460 bytes per TCP segment, it will be six packets. The point is not so much to make it 8750 bytes rather than 8760 as to make it six packets rather than seven.
This would make a cute Google interview question. :-)
Why does the number of packets matter? Because for modern internet connections, it's the latency that matters, not the bandwidth. A full packet is barely any slower than a 1-byte packet. The effect is especially important at the start of a connection, when the TCP windows are still opening.
Furthermore, the more packets, the more chance one of them will be lost, perhaps causing a very long delay. The chance is not very high but if you're already as tightly-tuned as they are, it's worth it.
So should you do it? I would say generally not, unless you're confident that you really are already sending just a handful of packets for the page in question. (You should measure it from a realistic client location.) Knowing that the page validates is worthwhile.
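
A back-of-the-envelope check of that packet math (the sizes are the ones quoted above; the real MSS varies with TCP options):

import math

mss = 1460  # typical MSS for a 1500-byte MTU; TCP options shrink it
for size in (8750, 8760, 8761):
    print(size, "bytes ->", math.ceil(size / mss), "segments")
# 8750 -> 6, 8760 -> 6, 8761 -> 7: trimming a few bytes only matters
# when it moves the page across a segment boundary.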
Adding to Jon Skeet's answer, this page shows there are 3 billion searches on Google per day. I don't know how accurate that is, but I would guess it's in the ballpark.
</body></html> is 14 characters, and at 3 billion searches per day it amounts to approximately 39.12 GB of data per day ignoring compression, or around 26 GB if we take gzipping into account.
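
The arithmetic, for anyone who wants to check it (the 3-billion figure and the roughly one-third gzip saving are the estimates quoted above):

bytes_per_day = 14 * 3_000_000_000        # 14 bytes saved per response
print(bytes_per_day / 2**30)              # ~39.12 GiB/day, uncompressed
print(bytes_per_day / 2**30 * 2 / 3)      # ~26 GiB/day with gzip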
However, Google might actually send the closing tags for body and html to older browsers by looking at their user agents. I cannot confirm or deny this, but checking modern browsers (Safari, Firefox, and Chrome) shows that all of them receive pages missing the closing tags.
So if you think it works for you, then go for it :). There's no harm in implementing this, which may or may not be a micro-optimization for you depending on the scale you're operating at.
According to the W3C, the body and html end tags are optional and can be omitted:
An html element's end tag may be omitted if the html element is not immediately followed by a comment.
A body element's end tag may be omitted if the body element is not immediately followed by a comment.
If the W3C Recommendation says it is OK, then it is perfectly valid. So there is no reason not to do it, unless you don't like unclosed tags.
As for me, my ISP injects ads into my websites; they insert their script just before </body></html>, so I omit those tags to evade their detection.

HTML Line Spacing & Compact Code

I have the following example code:
<body>
<div id="a"></div>
<div id="b"></div>
</body>
If I add empty lines between each of my original lines, like this:
<body>

<div id="a"></div>

<div id="b"></div>

</body>
does that do anything to my site's performance? Will the page load slower?
Yes, compact code speeds up page loading due to the decreased payload... but not by a measurable amount in most cases; unless your page is massive, you won't see a difference.
Pages should be delivered via gzip, making the size difference between spaced and un-spaced negligible. Just do what's readable to you; you'll thank yourself later. As with everything, you have to weigh the costs, in this case a very minor difference in payload size, against what's easiest for you to maintain.
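
To see how little the blank lines cost once gzip is involved, compare the compressed sizes yourself; a sketch using the example markup from the question:

import gzip

compact = b'<body>\n<div id="a"></div>\n<div id="b"></div>\n</body>'
spaced = b'<body>\n\n<div id="a"></div>\n\n<div id="b"></div>\n\n</body>'

for name, html in (("compact", compact), ("spaced", spaced)):
    print(name, len(html), "bytes ->", len(gzip.compress(html)), "gzipped")
# Runs of whitespace compress extremely well, so the gzipped versions
# typically differ by only a byte or two.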
In theory, yes.
For the server, if it has to send out a 1 MB file to each client, it has to spend some amount of time and resources, n, sending out that one file. Now, if you were able to cut the file size in half, the time and resources it would take per user on the server would be 0.5n.
For the client, it has to download the file. Assuming a download rate of 25 KB/s, a 1 MB file would take 41 seconds to download. A 0.5 MB file would take 20.5 seconds. That's a savings of 20 seconds just by reducing the file size.
However, in practice? No, I would not worry about it unless you're dealing with audio/video/picture data. That's because a character in an HTML document is only a byte or two. Sure, you might have, let's say, 100 extra characters that you could trim and remove, whitespace for instance. At most you'd save an additional 1 KB or so per page.
I wouldn't be too concerned about it unless you're developing an application or solution where it needs to be compact. Any modern or sub-modern computer won't break over 1 KB of extra data in an HTML file.
In your example, it saves 3 bytes of code... so I don't think it has any noticeable effect on page loading time given modern internet speeds. A better improvement would be to send your page gzipped.
Page loading and compact code? Yes, it really does make things better, as additional newlines and spaces are nothing but characters that need to be downloaded on the client end.
However, I suggest you see it as part of a bigger optimization strategy.
Take a look at the YSlow/Yahoo guidelines, which will help you understand the different parts of that strategy as applied to both server and client components. The collective results are amazing for big sites.