Pre-process html content with browsers - html

Unfortunately, for some special purpose, certain domains have been disabled by the GOVERNMENT FIREWALL.
For example, any domain matching *.gg.com is unreachable over HTTP.
The problem is that many websites request resources from this domain, and this makes many frequently used websites load extremely slowly.
There is a workaround: the block applies only to the domain name, not to the IP address, so if I point a CNAME record at the blocked domain, the content loads!
For example:
If I want to request http://script.gg.com/jquery.js
I can set script.mydomain.com as a CNAME to script.gg.com
and then request http://script.mydomain.com/jquery.js instead, which works.
I want to apply this automatically in my browser. I mainly use Google Chrome.
I'm wondering if there is a way to pre-process the HTML the browser loads and replace the domain of every resource link, mapping from a banned list to a valid list.
I imagine there could be a plugin or something else I could find or develop myself. Can anyone help?

Maybe you can pre-process it with JavaScript.
Firefox has the Greasemonkey plugin, and I think Chrome probably has something similar. Such a script is executed at the very beginning of page load, so you may be able to replace the URLs with some JavaScript and thus make the browser load the replaced URLs instead of the ones written in the actual code.
Update:
Chrome has Tampermonkey.
I tested it with this script and it worked okay:
// rewrite every link pointing at .google. to point at .bing. instead
var links = document.getElementsByTagName('a');
for (var i = 0; i < links.length; i++) {
    links[i].href = links[i].href.replace(".google.", ".bing.");
}
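For the original use case (resource URLs rather than anchor links), a minimal userscript sketch along these lines might work; the domain mapping, the @match pattern, and the element handling are assumptions you would adapt to the actual banned list:

// ==UserScript==
// @name     Domain remapper (sketch)
// @match    *://*/*
// @run-at   document-end
// ==/UserScript==

// Hypothetical mapping from blocked domains to their CNAME'd mirrors.
var domainMap = {
    'script.gg.com': 'script.mydomain.com',
    'img.gg.com': 'img.mydomain.com'
};

function remap(url) {
    for (var banned in domainMap) {
        if (url.indexOf('//' + banned + '/') !== -1) {
            return url.replace(banned, domainMap[banned]);
        }
    }
    return url;
}

// Images and stylesheets reload automatically when their URL changes.
Array.prototype.forEach.call(document.querySelectorAll('img[src], link[href]'), function (el) {
    if (el.src) el.src = remap(el.src);
    if (el.href) el.href = remap(el.href);
});

// Scripts that already failed to load have to be re-created to be fetched again.
Array.prototype.forEach.call(document.querySelectorAll('script[src]'), function (old) {
    var fixed = remap(old.src);
    if (fixed !== old.src) {
        var s = document.createElement('script');
        s.src = fixed;
        old.parentNode.replaceChild(s, old);
    }
});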

Related

Use chrome extension to trick page into thinking it's not in an iFrame

Is there a way to create a Chrome extension to trick a site loaded in an iFrame into thinking it's not in a frame?
We load clients' sites into an iframe for demos, but some resources get blocked due to them disallowing being loaded in an iFrame. We'd like to load these sites into a frame as though you were browsing directly to the site in a standalone tab.
You should use Chrome's webRequest API to intercept the server response. See the API docs. You want the onHeadersReceived event, where you are in control of the response headers: remove the X-Frame-Options header from the response.
That's pretty much it, if this is the only problem in loading those sites.
However, for the sake of completeness, in order to fully trick the browser (which you most likely do not need), you would also have to inject a script into every page to clean up a few things, such as window.parent, by simply removing them from the window object, along with a few other things like origin. Removing the header, though, will work for 99.9999% of your use cases.
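As a rough sketch (Manifest V2; this assumes the webRequest, webRequestBlocking, and <all_urls> permissions are declared in manifest.json), the background script could look like this:

// background.js (Manifest V2 sketch)
chrome.webRequest.onHeadersReceived.addListener(
    function (details) {
        // Drop X-Frame-Options so the response may be framed. A page that also
        // uses a Content-Security-Policy frame-ancestors directive would need
        // that header handled separately.
        var headers = details.responseHeaders.filter(function (h) {
            return h.name.toLowerCase() !== 'x-frame-options';
        });
        return { responseHeaders: headers };
    },
    { urls: ['<all_urls>'], types: ['sub_frame'] },
    ['blocking', 'responseHeaders']
);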

Attachment of external content - forcing although X-Frame-Option=SAMEORIGIN

I have searched around on the Internet, but I haven't managed to find a solution to this problem:
Is it possible to embed some external content when the server sends X-Frame-Options: SAMEORIGIN?
I know that <iframe> can't be used, but maybe there is some other way.
Thanks in advance
No, it's not possible to show another page's contents within your website if they are setting the HTTP header X-Frame-Options: SAMEORIGIN. That header says that the page can only be embedded on pages on the same domain name.
However, if you are running your own server-side application (i.e. using PHP, Node.js, etc), you can scrape the website on your server, and then display whatever info you needed from the other site that way. It will be more work this way, and you probably won't be able to perfectly replicate how everything appeared on the source site, but it's the only route you've got. I suggest googling "scraping" + the name of your server-side language/environment to learn how to do this.
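As an illustration, here is a minimal Node.js sketch of that idea (the route and target URL are placeholders; it assumes Node 18+ for the built-in fetch and the express package):

const express = require('express');
const app = express();

app.get('/embedded-content', async (req, res) => {
    // X-Frame-Options only stops browsers from framing the page;
    // a server-side request is free to fetch it.
    const response = await fetch('https://example.com/page-you-want-to-show');
    const html = await response.text();
    // In practice you would parse `html` and re-serve only the pieces you need.
    res.send(html);
});

app.listen(3000);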

Difference between href="#!" and href="#" [duplicate]

I've just noticed that the long, convoluted Facebook URLs that we're used to now look like this:
http://www.facebook.com/example.profile#!/pages/Another-Page/123456789012345
As far as I can recall, earlier this year it was just a normal URL-fragment-like string (starting with #), without the exclamation mark. But now it's a shebang or hashbang (#!), which I've previously only seen in shell scripts and Perl scripts.
The new Twitter URLs now also feature the #! symbols. A Twitter profile URL, for example, now looks like this:
http://twitter.com/#!/BoltClock
Does #! now play some special role in URLs, like for a certain Ajax framework or something since the new Facebook and Twitter interfaces are now largely Ajaxified?
Would using this in my URLs benefit my Web application in any way?
This technique is now deprecated.
This used to tell Google how to index the page.
https://developers.google.com/webmasters/ajax-crawling/
This technique has mostly been supplanted by the ability to use the JavaScript History API that was introduced alongside HTML5. For a URL like www.example.com/ajax.html#!key=value, Google will check the URL www.example.com/ajax.html?_escaped_fragment_=key=value to fetch a non-AJAX version of the contents.
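To make that mapping concrete, here it is as a one-liner (same example URL as above):

// What the user and other crawlers see:
var hashbangUrl = 'http://www.example.com/ajax.html#!key=value';
// What Google used to fetch instead to get a non-AJAX version:
var crawledUrl = hashbangUrl.replace('#!', '?_escaped_fragment_=');
// -> 'http://www.example.com/ajax.html?_escaped_fragment_=key=value'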
The octothorpe/number-sign/hashmark has a special significance in an URL, it normally identifies the name of a section of a document. The precise term is that the text following the hash is the anchor portion of an URL. If you use Wikipedia, you will see that most pages have a table of contents and you can jump to sections within the document with an anchor, such as:
https://en.wikipedia.org/wiki/Alan_Turing#Early_computers_and_the_Turing_test
https://en.wikipedia.org/wiki/Alan_Turing identifies the page and Early_computers_and_the_Turing_test is the anchor. The reason that Facebook and other Javascript-driven applications (like my own Wood & Stones) use anchors is that they want to make pages bookmarkable (as suggested by a comment on that answer) or support the back button without reloading the entire page from the server.
In order to support bookmarking and the back button, you need to change the URL. However, if you change the page portion (with something like window.location = 'http://raganwald.com';), whether to a different URL or to the same URL without an anchor, the browser will load the entire page from that URL. Try this in Firebug or Safari's Javascript console. Load http://minimal-github.gilesb.com/raganwald. Now in the Javascript console, type:
window.location = 'http://minimal-github.gilesb.com/raganwald';
You will see the page refresh from the server. Now type:
window.location = 'http://minimal-github.gilesb.com/raganwald#try_this';
Aha! No page refresh! Type:
window.location = 'http://minimal-github.gilesb.com/raganwald#and_this';
Still no refresh. Use the back button to see that these URLs are in the browser history. The browser notices that we are on the same page but just changing the anchor, so it doesn't reload. Thanks to this behaviour, we can have a single Javascript application that appears to the browser to be on one 'page' but to have many bookmarkable sections that respect the back button. The application must change the anchor when a user enters different 'states', and likewise if a user uses the back button or a bookmark or a link to load the application with an anchor included, the application must restore the appropriate state.
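A minimal sketch of that anchor bookkeeping (showState is a hypothetical function that renders the requested state):

// When the user navigates within the app, record the state in the anchor.
function goTo(state) {
    window.location.hash = state;   // no page reload, new history entry
}

// When the anchor changes (back button, bookmark, shared link), restore the state.
window.addEventListener('hashchange', function () {
    showState(window.location.hash.slice(1));
});

// On initial load, honour any anchor already present in the URL.
if (window.location.hash) {
    showState(window.location.hash.slice(1));
}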
So there you have it: Anchors provide Javascript programmers with a mechanism for making bookmarkable, indexable, and back-button-friendly applications. This technique has a name: It is a Single Page Interface.
p.s. There is a fourth benefit to this technique: Loading page content through AJAX and then injecting it into the current DOM can be much faster than loading a new page. In addition to the speed increase, further tricks like loading certain portions in the background can be performed under the programmer's control.
p.p.s. Given all of that, the 'bang' or exclamation mark is a further hint to Google's web crawler that the exact same page can be loaded from the server at a slightly different URL. See Ajax Crawling. Another technique is to make each link point to a server-accessible URL and then use unobtrusive Javascript to change it into an SPI with an anchor.
Here's the key link again: The Single Page Interface Manifesto
First of all: I'm the author of The Single Page Interface Manifesto cited by raganwald.
As raganwald has explained very well, the most important aspect of the Single Page Interface (SPI) approach used by Facebook and Twitter is the use of the hash # in URLs.
The character ! is added only for Google's benefit; this notation is a Google "standard" for crawling AJAX-intensive web sites (in the extreme, Single Page Interface web sites). When Google's crawler finds a URL with #!, it knows that an alternative conventional URL exists providing the same page "state", but in this case rendered at load time.
Although the #! combination is very interesting for SEO, it is only supported by Google (as far as I know); with some JavaScript tricks you can build SPI web sites that are SEO-compatible with any web crawler (Yahoo, Bing...).
The SPI Manifesto and demos do not use Google's format of ! in hashes; this notation could easily be added, and SPI crawling could then be even easier (UPDATE: the ! notation is now used and remains compatible with other search engines).
Take a look at this tutorial; it is an example of a simple ItsNat SPI site, but you can pick up some ideas for other frameworks. This example is SEO-compatible with any web crawler.
The hard problem is generating any (or selected) "AJAX page state" as plain HTML for SEO; in ItsNat this is very easy and automatic, because the same site is at the same time SPI-based and page-based for SEO (or for when JavaScript is disabled, for accessibility). With other web frameworks you can always follow the double-site approach: one site is SPI-based and another is page-based for SEO; Twitter, for instance, uses this "double site" technique.
I would be very careful if you are considering adopting this hashbang convention.
Once you hashbang, you can’t go back. This is probably the stickiest issue. Ben’s post put forward the point that when pushState is more widely adopted then we can leave hashbangs behind and return to traditional URLs. Well, fact is, you can’t. Earlier I stated that URLs are forever, they get indexed and archived and generally kept around. To add to that, cool URLs don’t change. We don’t want to disconnect ourselves from all the valuable links to our content. If you’ve implemented hashbang URLs at any point then want to change them without breaking links the only way you can do it is by running some JavaScript on the root document of your domain. Forever. It’s in no way temporary, you are stuck with it.
You really want to use pushState instead of hashbangs, because making your URLs ugly and possibly broken -- forever -- is a colossal and permanent downside to hashbangs.
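For comparison, a small pushState sketch (the path and the renderSection function are illustrative): the URL changes without a reload and without any #! in it:

// Navigate within the app: a real-looking URL, no reload, no hashbang.
history.pushState({ section: 'photos' }, '', '/photos');

// Handle back/forward: the state object saved above comes back here.
window.addEventListener('popstate', function (event) {
    renderSection(event.state ? event.state.section : 'home');
});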
As a good follow-up to all of this, Twitter, one of the pioneers of hashbang URLs and the single-page interface, admitted that the hashbang system was slow in the long run and that they have actually started reversing the decision and returning to old-school links.
The article about this is here.
I always assumed the ! just indicated that the hash fragment that followed corresponded to a URL, with ! taking the place of the site root or domain. It could be anything, in theory, but it seems the Google AJAX Crawling API likes it this way.
The hash, of course, just indicates that no real page reload is occurring, so yes, it’s for AJAX purposes. Edit: Raganwald does a lovely job explaining this in more detail.

Google bot crawling on AngularJS site with HTML5 Mode routes

We have an AngularJS site using HTML5 routes. I just did some test "Fetch as Google" runs. The results are a bit confusing:
On the fetching tab, I see our site as it looks in view source, with all the front-end bindings {{ }} and not all of the HTML rendered.
On the rendering tab, our site looks perfectly fine, with no {{ }} variables; it seems Googlebot fetched and rendered the site fine, which may be in line with this: http://googlewebmastercentral.blogspot.ae/2014/05/rendering-pages-with-fetch-as-google.html.
However, we were already prepared for Google not being able to crawl our site, so we have already added <meta name="fragment" content="!">, so that Googlebot revisits our page with "?_escaped_fragment_=". We followed this: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started (section "3. Handle pages without hash fragments"). In our Nginx config we have something like this:
if ($args ~ "_escaped_fragment_=") {
    # serve the static HTML snapshots
}
, and indeed it works fine if we pass _escaped_fragment_= ourselves. However, Googlebot never tried to crawl our site with this param, so it never crawled the snapshot. Are we missing something? Should we also add user-agent detection for Googlebot in our Nginx conf? Something like this?
if ($http_user_agent ~* "googlebot|yahoo|bingbot|baiduspider|yandex|yeti|yodaobot|gigabot|ia_archiver|facebookexternalhit|twitterbot|developers\.google\.com") {
    # serve from snapshots
}
It would be great if we can understand this better, thank you so much in advance!
UPDATE:
I just read this, http://scotch.io/tutorials/javascript/angularjs-seo-with-prerender-io?_escaped_fragment_=tag#caveats. So, it seems that when using the manual tools (Fetch as Google), we should pass ourselves either #! or ?_escaped_fragment_= in the right place. Indeed, if I pass ?_escaped_fragment_= in our case, I do see the HTML snapshot that we have created.
Is that true? Is this how it works indeed?
UPDATE 2
On the bottom of this thread, a Google employee verifies that for Google Webmasters "Fetch as Google", you need to manually pass the _escaped_fragment_= param yourself, https://productforums.google.com/forum/#!msg/webmasters/fZjdyjq0n98/PZ-nlq_2RjcJ
Cheers,
Iraklis
I will try to answer your questions based on our experiences in the last month of developing a SPA with HTML5 mode.
How do I get Googlebot to use ?_escaped_fragment_= instead of the direct links?
This is actually quite simple but easy to overlook. In fact, there are two different ways to get Googlebot to try the escaped_fragment. The first method is to run your site in non-html5 mode. This means that your URLs will be of the form:
http://my.domain.com/base/#!some/path/on/website
Googlebot recognizes the #! and makes a second call to your server with an altered URL:
http://my.domain.com/base/?_escaped_fragment_=some/path/on/website
Which you can then handle as you wish. The second way to get Googlebot to try _escaped_fragment_ mode is to include the following meta tag on the index page you supply to the bot:
<meta name="fragment" content="!">
This will make googlebot check the other version of the webpage every time it sees the tag. Interestingly you can use both these techniques together or you can do what we ended up doing, which is running in html5 mode with the meta tag. This means that your URLs will be escaped as follows:
http://my.domain.com/base/some/path/on/website?_escaped_fragment_=
Interestingly, the bot will not put anything at the end of the fragment. But depending on what webserver you are running, you can easily map this with a pattern matching the "_escaped_fragment_" text to your alternate bot page. For more information on the escaped fragment go here.
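For instance, with a Node/Express front end (an assumed setup: `app` is an existing Express app and the snapshot directory layout is hypothetical; the same idea applies to the Nginx rules quoted in the question), that mapping can be a tiny middleware:

const path = require('path');

// Serve a pre-rendered snapshot whenever _escaped_fragment_ is present.
app.use(function (req, res, next) {
    if ('_escaped_fragment_' in req.query) {
        // e.g. GET /base/some/path/on/website?_escaped_fragment_=
        var snapshot = path.join(__dirname, 'snapshots', req.path, 'index.html');
        return res.sendFile(snapshot);
    }
    next();
});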
"Fetch as Googlebot" returns two different versions of my page, the source with {{}} and the rendered page looking correct. What does that mean?
Google's bots have actually been able to interpret JavaScript to a limited extent since early 2014. For more information, read the official blog entry on Google Webmasters here. However, as is made clear in the blog entry, this comes with a lot of caveats. For instance:
Googlebot does not guarantee to execute all javascript code.
Googlebot will attempt to find links in the javascript to follow and use them to help find more pages.
Googlebot will render the preview in webmasters tools by executing as much of the javascript as it can (thus the lack of {{}} in the rendered version).
Googlebot will not necessarily use the rendered version in order to build the meta information about your site for its index.
As of 18/12/2014, we are still unsure if Googlebot can actually extract any information from an SPA in rendered mode for its index beyond finding links to follow in the javascript. In our experience, Googlebot will include {{}} in its index listing so that when you try to use {{}} to fill meta information (description, keywords, title, etc...) your site looks like this in Google Search results:
{{meta.siteTitle}}
http://my.domain.com/base/some/path/on/website
{{meta.description}}
rather than what you expect which might look like this:
Domain
http://my.domain.com/base/some/path/on/website
This is a random page on my domain. An excellent example page to be sure!
GoogleBot for the search engine uses _escaped_fragment_, but we cannot be sure about other services
Google recommends serving an HTML snapshot of an AJAX website by using the hashbang (#!) and the _escaped_fragment_ param.
But as is often the case with new Google features, not all Google services support it from the beginning.
For now, from experience, we are sure that GoogleBot uses the HTML snapshot and _escaped_fragment_ when indexing a webpage. You can check your server access logs to confirm Google did this on your application.
(For now and from experience, nothing official from Google) for other services such as PageSpeed Insights, the Webmaster Tools parser, the rich snippet testing tools, etc., the hashbang (#!) is not supported. You have to use _escaped_fragment_.
Should you use user-agent detection to serve the HTML snapshot?
No. Just don't, for several reasons:
You just do not know which services/bots on the web may want to parse your content, and you cannot be exhaustive (for instance, think of all the social networks on the web that use a bot to create a snippet of your content: you cannot handle them one by one).
This can be considered cloaking: serving a different version depending on the type of user at the same URL, which is basically wrong for SEO.
Google looks for #! in our site URLs, then takes everything after the #! and adds it to the _escaped_fragment_ query parameter. Some developers create basic HTML pages with real data and serve these pages from the server side at crawl time. So why not render those same pages with PhantomJS on the server side when the request carries _escaped_fragment_?
For more detail, please read this blog.
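A rough PhantomJS sketch of that idea (the two-second wait and the way the URL is supplied are assumptions; real setups usually wait for the app to signal that rendering is done):

// prerender.js -- run as: phantomjs prerender.js 'http://my.domain.com/base/#!some/path'
var page = require('webpage').create();
var url = require('system').args[1];

page.open(url, function (status) {
    if (status === 'success') {
        // Give the SPA a moment to finish its AJAX calls, then dump the rendered DOM.
        setTimeout(function () {
            console.log(page.content);   // the HTML snapshot to serve to the bot
            phantom.exit();
        }, 2000);
    } else {
        phantom.exit(1);
    }
});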
Maybe a bit outdated, but for completeness:
According to the statement from May 23, 2014 Google bot is now able to "see your content more like modern Web browsers".
According to their statement from October 14, 2015 Google deprecated the AJAX crawling scheme.
So using the HTML5 History API (html5Mode in Angular) should be no problem for Google.

Source code in the URL

Normally you go to a website, and by right-clicking you can choose to view the source code, or you just use Firebug and select an element you want to analyse. Is it possible to write the source code into the URL so that it wouldn't be shown by right-clicking and choosing, or by selecting an element?
I'm asking because I've already seen this phenomenon once while using an iPhone simulator in Safari.
Any ideas or hints about what exactly I'm looking for? Your help would be great.
Edit: This was based on wrong information. You can see the source code by right-clicking, but the URL still contains all the information about the site. I'll get back to you as soon as I have more information to write down clearly. Sorry for all the confusion.
Edit: This is the code in the URL containing the information about the site.
data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw%2BDQo8aHRtbCBtYW5pZmVzdD0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvc3BvcnRpbmdiZXRQb3J0YWwubWFuaWZlc3QnPg0KPGhlYWQ%2BPHRpdGxlPlNwb3J0aW5nYmV0PC90aXRsZT4NCiAgICA8bWV0YSBodHRwLWVxdWl2PSdjb250ZW50LXR5cGUnIGNvbnRlbnQ9J3RleHQvaHRtbDsgY2hhcnNldD11dGYtOCc%2BDQoJPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J21heGltdW0tc2NhbGU9MSwgd2lkdGg9ZGV2aWNlLXdpZHRoLCBoZWlnaHQ9ZGV2aWNlLWhlaWdodCwgdXNlci1zY2FsYWJsZT1ubywgbWluaW11bS1zY2FsZT0xLjAnPg0KICAgIDxtZXRhIG5hbWU9J2FwcGxlLW1vYmlsZS13ZWItYXBwLWNhcGFibGUnIGNvbnRlbnQ9J1lFUyc%2BDQogICAgPG1ldGEgbmFtZT0nYXBwbGUtbW9iaWxlLXdlYi1hcHAtc3RhdHVzLWJhci1zdHlsZScgY29udGVudD0nYmxhY2snPg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBsYW5ndWFnZT0namF2YXNjcmlwdCc%2BDQogIGlmIChkb2N1bWVudC5yZWZlcnJlciA9PSAnJykNCiAgew0KICAgd2luZG93LmxvY2F0aW9uPSdkYXRhOnRleHQvaHRtbDtjaGFyc2V0PXV0Zi04O2Jhc2U2NCxQR2gwYld3JTJCUEdobFlXUSUyQlBHMWxkR0VnYm1GdFpUMG5kbWxsZDNCdmNuUW5JR052Ym5SbGJuUTlKMjFoZUdsdGRXMHRjMk5oYkdVOU1Td2dkMmxrZEdnOVpHVjJhV05sTFhkcFpIUm9MQ0IxYzJWeUxYTmpZV3hoWW14bFBXNXZMQ0J0YVc1cGJYVnRMWE5qWVd4bFBURXVNQ2MlMkJQRzFsZEdFZ2JtRnRaVDBuWVhCd2JHVXRiVzlpYVd4bExYZGxZaTFoY0hBdFkyRndZV0pzWlNjZ1kyOXVkR1Z1ZEQwbldVVlRKejQ4YldWMFlTQnVZVzFsUFNkaGNIQnNaUzF0YjJKcGJHVXRkMlZpTFdGd2NDMXpkR0YwZFhNdFltRnlMWE4wZVd4bEp5QmpiMjUwWlc1MFBTZGliR0ZqYXljJTJCUEUxRlZFRWdhSFIwY0MxbGNYVnBkajBuY21WbWNtVnphQ2NnWTI5dWRHVnVkRDBuTVR0VlVrdzlhSFIwY0hNNkx5OTNaV0poY0hBdWJXWjFjMlV1WTI5dEwxTndiM0owYVc1blltVjBMMmx3YUc5dVpTOXBibVJsZUMxbGJsOUhRaTVvZEcxc1AybGtQVFUzTkRVME1qa3lNRUUwTURRMk1UVXdNVE01TUVaRFFUTTRNREJFTkRnNEpteHZZMkZzWlQxbGJsOUhRaVpoWm1acGJHbGhkR1ZKUkQwblBqd3ZhR1ZoWkQ0OGMzUjViR1UlMkJZbTlrZVh0aVlXTnJaM0p2ZFc1a0xXTnZiRzl5T2lNd01EQTdkR1Y0ZEMxaGJHbG5ianBqWlc1MFpYSTdZMjlzYjNJNkkwWkdSanRtYjI1MExXWmhiV2xzZVRwQmNtbGhiQ3dnU0dWc2RtVjBhV05oTENCellXNXpMWE5sY21sbU8yWnZiblF0YzJsNlpUb3lNSEI0TzMwOEwzTjBlV3hsUGp4aWIyUjVQanh3UG14dllXUnBibWN1TGk0OEwzQSUyQlBDOWliMlI1UGp3dmFIUnRiRDQ9Jw0KCSB9DQogICAgPC9zY3JpcHQ%2BDQogICAgPGxpbmsgcmVsPSdhcHBsZS10b3VjaC1pY29uLXByZWNvbXBvc2VkJyBocmVmPSdodHRwOi8vbm92b3dlYi5tZnVzZS5jb20vd2ViYXBwL1Nwb3J0aW5nYmV0L3BvcnRhbC9JbWFnZXMvV2ViQ2xpcEljb24tZW5fR0IucG5nJz4NCiAgICA8bGluayByZWw9J3N0eWxlc2hlZXQnIHR5cGU9J3RleHQvY3NzJyBocmVmPSdodHRwOi8vbm92b3dlYi5tZnVzZS5jb20vd2ViYXBwL1Nwb3J0aW5nYmV0L3BvcnRhbC9jc3MvbWFpbi1lbl9HQi5jc3MnPg0KICAgIDxsaW5rIHJlbD0nc3R5bGVzaGVldCcgdHlwZT0ndGV4dC9jc3MnIGhyZWY9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL2Nzcy9UcmFuc2l0aW9ucy5jc3MnPg0KCTxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL3V0aWxpdGllcy5qcycgY2hhcnNldD0ndXRmLTgnPjwvc2NyaXB0Pg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL3NldHVwLWVuX0dCLmpzJyBjaGFyc2V0PSd1dGYtOCc%2BPC9zY3JpcHQ%2BDQogICAgPHNjcmlwdCB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnIHNyYz0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvbWFpbi5qcycgY2hhcnNldD0ndXRmLTgnPjwvc2NyaXB0Pg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL1RleHQuanMnIGNoYXJzZXQ9J3V0Zi04Jz48L3NjcmlwdD4NCiAgICA8c2NyaXB0IHR5cGU9J3RleHQvamF2YXNjcmlwdCcgc3JjPSdodHRwOi8vbm92b3dlYi5tZnVzZS5jb20vd2ViYXBwL1Nwb3J0aW5nYmV0L3BvcnRhbC9QYXJ0cy9QdXNoQnV0dG9uLmpzJyBjaGFyc2V0PSd1dGYtOCc%2BPC9zY3JpcHQ%2BDQogICAgPHNjcmlwdCB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnIHNyYz0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvUGFydHMvQnV0dG9uSGFuZGxlci5qcycgY2hhcnNldD0ndXR
mLTgnPjwvc2NyaXB0Pg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL1RyYW5zaXRpb25zLmpzJyBjaGFyc2V0PSd1dGYtOCc%2BPC9zY3JpcHQ%2BDQogICAgPHNjcmlwdCB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnIHNyYz0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvUGFydHMvU3RhY2tMYXlvdXQuanMnIGNoYXJzZXQ9J3V0Zi04Jz48L3NjcmlwdD4NCjwvaGVhZD4NCjxib2R5IG9uTG9hZD0nbG9hZCgpOyc%2BDQogICAgPGRpdiBpZD0nc3RhY2tMYXlvdXQnPjxkaXYgaWQ9J3NlbGVjdGlvbi1wYWdlJz4NCiAgICAgICAgICAgIDxkaXYgaWQ9J2xhbmRpbmdwYWdlJz4NCiAgICAgICAgICAgICAgICA8ZGl2IGlkPSdjZW50cmVUb3BCRyc%2BPC9kaXY%2BPGRpdiBpZD0nY2VudHJlQm90dG9tQkcnPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2xvZ28nPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2ljb24nPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2Rpc3BhbHlib3gnPg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDEnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDInPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgY2xhc3M9J3ZpZXcyJyBpZD0naXBob25lJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSd2aWV3MicgaWQ9J2Nhc2lubyc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgPGRpdiBpZD0nZGlzcGFseWJveDMnPg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDUnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDYnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDcnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDgnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDknPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDEwJz48L2Rpdj4NCiAgICAgICAgICAgICAgICAgICAgPGRpdiBpZD0ndGV4dHAxMSc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgICAgIDxkaXYgaWQ9J3RleHRwMTInPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDEzJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8L2Rpdj4NCiAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICA8L2Rpdj48ZGl2IGlkPSdpbnN0YWxsLWFwcC1wYWdlJz4NCiAgICAgICAgICAgIDxkaXYgaWQ9J2luc3RhbGwnPg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2NlbnRyZVRvcEJHMSc%2BPC9kaXY%2BPGRpdiBpZD0nY2VudHJlQm90dG9tQkcxJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8ZGl2IGlkPSdkaXNwYWx5Ym94MSc%2BDQogICAgICAgICAgICAgICAgICAgIDxkaXYgaWQ9J3RleHRwMyc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgICAgIDxkaXYgaWQ9J3RleHRwNCc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgPC9kaXY%2BDQogICAgICAgICAgICAgICAgPGRpdiBpZD0naWNvbjEnPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2xvZ28xJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8ZGl2IGlkPSdidXR0b24yJz48L2Rpdj4NCiAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICA8L2Rpdj48L2Rpdj4NCjwvYm9keT4NCjwvaHRtbD4=
No, it's not possible to hide a website's source code. The reason for that is simply that the browser needs that code to display the website, so whenever you see a website, you'll always be able to see as much code as is needed to make the website look like that.
You can mangle the code a bit, but as you have said yourself, things like Firebug are able to display the current state of a website, so you'll also be able to see the correct code.
edit
Just a note: Just because Safari with an iPhone user agent isn't able to display the source code, it doesn't mean that the code is not there or somehow encrypted into the URL. If you can see the website, the code is there.
I guess it's a bug (or a feature?) that Safari isn't able to display it in iPhone mode (maybe because the iPhone itself isn't able to display the code either).
edit 2
Okay, it indeed set the URL to the following for me:
data:text/html;charset=utf-8;base64,PGh0bWw%2BPGhlYWQ%2BPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J21heGltdW0tc2NhbGU9MSwgd2lkdGg9ZGV2aWNlLXdpZHRoLCB1c2VyLXNjYWxhYmxlPW5vLCBtaW5pbXVtLXNjYWxlPTEuMCc%2BPG1ldGEgbmFtZT0nYXBwbGUtbW9iaWxlLXdlYi1hcHAtY2FwYWJsZScgY29udGVudD0nWUVTJz48bWV0YSBuYW1lPSdhcHBsZS1tb2JpbGUtd2ViLWFwcC1zdGF0dXMtYmFyLXN0eWxlJyBjb250ZW50PSdibGFjayc%2BPE1FVEEgaHR0cC1lcXVpdj0ncmVmcmVzaCcgY29udGVudD0nMTtVUkw9aHR0cHM6Ly93ZWJhcHAubWZ1c2UuY29tL1Nwb3J0aW5nYmV0L2lwaG9uZS9pbmRleC1lbl9HQi5odG1sP2lkPTU4NjIwNEE2MEE0MDQ2MTUwMTM5MEZDQTFBQTdGNDFBJmxvY2FsZT1lbl9HQiZhZmZpbGlhdGVJRD0nPjwvaGVhZD48c3R5bGU%2BYm9keXtiYWNrZ3JvdW5kLWNvbG9yOiMwMDA7dGV4dC1hbGlnbjpjZW50ZXI7Y29sb3I6I0ZGRjtmb250LWZhbWlseTpBcmlhbCwgSGVsdmV0aWNhLCBzYW5zLXNlcmlmO2ZvbnQtc2l6ZToyMHB4O308L3N0eWxlPjxib2R5PjxwPmxvYWRpbmcuLi48L3A%2BPC9ib2R5PjwvaHRtbD4=
This, however, just decodes to a loading & redirect page that itself redirects to a different webpage with a special session-like parameter. I guess they didn't want to create real server-side sessions for this, so they put the parameter into the redirect page and encoded the whole thing using the data: URI to avoid creating a custom page for it. However, this neither helps the browser (in terms of speed or anything else) nor hides the source code, as you can just decode it again to see the original source.
What you're referring to is the data URI scheme, which allows base64-encoded data to be embedded directly in the URL (within the request itself), where normally an http/etc. URL would be used to initiate a new request.
The data URI scheme is a URI scheme that provides a way to include data in-line in web pages as if they were external resources. It tends to be simpler than other inclusion methods, such as MIME with cid or mid URIs.
Read the Wikipedia page for more details: http://en.wikipedia.org/wiki/Data_URI_scheme
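As a quick sketch you can try in a browser console, the base64 payload is trivially reversible:

// Build a data: URI from plain markup (this is all such a URL contains)...
var html = '<html><body><p>loading...</p></body></html>';
var dataUri = 'data:text/html;charset=utf-8;base64,' + btoa(html);

// ...and decode it again; nothing is hidden or encrypted.
var decoded = atob(dataUri.split('base64,')[1]);
console.log(decoded === html); // true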
I don't know what you're trying to achieve, but if you want to hide the source code because "anybody can steal my code": that isn't possible. The source code has to get to the browser one way or another so the browser can display it, and if the code is on the client machine (in the browser), there will always be a way to grab it.
Even if you restrict right clicking, or viewing the source, it is impossible to hide it from everybody. Also, placing it in the URL would be bad, very bad (I can't even imagine it).
The HTML is needed for the browser to render the UI. You can't hide it.
You could compress and obfuscate the javascript though, to make it difficult to read and understand. But that's evil :)
Internet Explorer has a URL limit of 2,083 characters, so you would have to compress the content and pray it fits in the URL after it has been base64-encoded. Then you could use JavaScript to decode it. It would also be extremely difficult to update your pages or allow bookmarking, and it could result in users exploiting the system.
Chances are nobody will want your sauce code anyway, and if they did, it wouldn't affect you one little bit. Facebook shows its sauce, and I don't see its popularity dropping. So just stick with serving your pages the normal way.
1. The length of a URL is limited, so you couldn't write a whole page into it even if it were possible.
2. Once a thing has been displayed on a client machine, the code cannot be protected.
(Well, disabling right-click with JavaScript could repel a few noobs, but it is still fairly easy to grab the code.)