How to prevent an HTML5 page from caching?

I converted a plain vanilla HTML page to HTML5/CSS3 with a responsive layout, and for security reasons (dictated by the security people) the page must never be cached.
The page previously used <meta http-equiv="Pragma" content="no-cache"> and <meta http-equiv="Expires" content="-1"> to prevent the page from being cached.
What replaces this in HTML5?
How do you prevent an html page from caching in the client?
I've spent a week reading about manifest files, but they seem to do exactly the opposite of what I want: attaching a manifest file explicitly causes the page it is attached to to be cached.
And please don't refer me back to the w3c definition of which meta elements are now allowed — I understand that HTML5 does not include the cache-control or Pragma in meta elements.
I need to know what it does include that will prevent a page from being cached.

At the top of the page, reference a manifest file:
<!DOCTYPE html>
<html manifest="manifest.appcache">
...
Then create manifest.appcache with the following content:
CACHE MANIFEST
# Cache manifest version 1.0
# no cache
NETWORK:
*
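Separately from the manifest approach, whether a browser stores a page is normally decided by HTTP response headers rather than by markup. That is the usual replacement for the old Pragma and Expires meta elements. The following is a minimal sketch under stated assumptions: a Node.js-style response object is assumed (the question does not say which server is used), and handleRequest is a hypothetical name.

```javascript
// Sketch only: a Node.js-style response object is an assumption,
// not the asker's actual server stack.

// Headers that ask browsers and intermediaries not to store the response.
const NO_CACHE_HEADERS = {
  'Cache-Control': 'no-store, no-cache, must-revalidate',
  'Pragma': 'no-cache', // understood by HTTP/1.0 clients
  'Expires': '0',       // an invalid date is treated as already expired
};

// Hypothetical request handler: sets the headers, then sends the page.
function handleRequest(req, res) {
  for (const [name, value] of Object.entries(NO_CACHE_HEADERS)) {
    res.setHeader(name, value);
  }
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<!DOCTYPE html><title>Never cached</title>');
}
```

In Node.js the handler could be passed to http.createServer(handleRequest); other servers have their own way of setting the same three headers.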

I dislike appcache a tremendous amount. It almost works well but can be a real, unforgiving pain. While refactoring some code, I realized that after logout I could browse back to the last page. Of course, refreshing the browser would force the user back to the login page, but this is not the desired behavior.
After searching around and seeing the options, I started to get a bit frustrated. I do not want to use appcache. I then realized that my code was redirecting to the login page after destroying the session, and I got an idea: what if I redirect to the home page instead? And voila, the page was loaded, the session was checked (and of course was not there), and the user was redirected to login. Problem solved.
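The redirect logic described above might look roughly like this. It is only a sketch: the route names (/logout, /login, /) and the session shape (an object with a userId) are hypothetical, since the answer does not show its code.

```javascript
// Hypothetical routes and session shape, for illustration only.
// Returns the path the browser should end up on.
function nextLocation(session, requestedPath) {
  const loggedIn = Boolean(session && session.userId);

  if (requestedPath === '/logout') {
    // The caller destroys the session; redirect to the home page rather than
    // straight to /login, so the home page's own session check runs.
    return '/';
  }
  if (!loggedIn && requestedPath !== '/login') {
    // Session check: no session means the user is sent to the login page.
    return '/login';
  }
  return requestedPath;
}
```

With this flow, a logged-out visitor who lands on the home page is sent on to /login by the same check that protects every other page.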

I have been struggling with the same issue for quite some time. What works for me - at least so far - in Chrome, FF and IE is doing the following:
1) Reference the manifest file: <html lang="nl" manifest="filename.appcache">. From what I understand, this will cache everything that follows in this HTML document, hence a manifest file is needed to prevent this from happening.
2) Use a manifest file filename.appcache with the following content, which basically says: for all files, do not read from cache but from the network server:
CACHE MANIFEST
# 2015-09-25 time 20:33 UTC v 1.01
NETWORK:
*
3) A third step is required: each time you upload a (partial) update of your website, also change the manifest file by changing the date and time stamp in the comment (#) line. Why? Because if you do not change the manifest file, it will not be read again, and the browser will default to step 1 and thus cache and read from cache. Changing the manifest file, however, forces it to be read again, so the "do not read from cache but read from network server" instruction in it is applied again.
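Step 3 is easy to forget when done by hand, so the manifest can be generated at deploy time instead. A minimal sketch, assuming a build step that can run JavaScript; buildNetworkOnlyManifest is a hypothetical helper name, and writing the result to filename.appcache is left to the caller.

```javascript
// Builds the network-only manifest from steps 2 and 3 with a fresh timestamp.
// The comment line changes on every call, so the file's bytes change too,
// which is what makes the browser treat the manifest as updated.
function buildNetworkOnlyManifest(now = new Date()) {
  return [
    'CACHE MANIFEST',          // must be the first line of the file
    `# ${now.toISOString()}`,  // timestamp comment, as in step 3
    'NETWORK:',
    '*',
    '',
  ].join('\n');
}
```

Calling buildNetworkOnlyManifest() without arguments stamps the manifest with the current time; passing a Date makes the output reproducible.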

The previous answer may not consistently prevent caching or clear an existing cache in Chrome, but there is a workaround.
1) To clear an existing cache in Chrome, it may be necessary to update all files of the website (e.g. by linking to a new CSS file on every page) along with the cache manifest. Even then, the existing cache is only cleared on the second visit to a page, because of the order in which a page is loaded: on the first visit the browser reads and caches the new manifest while still loading from the existing cache. Only on the second visit is the newly stored manifest read and applied.
2) If none of that helps, it is possible to include a script in the page that references the manifest, which waits for an updated cache to become ready and then reloads the page so that the new manifest takes effect. This did the trick and resolved all remaining cases I tested where files had remained persistently cached in Chrome. I found this script in a post by Jason Stimpel.
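<script type="text/javascript">
// Once the page has loaded, start listening for application cache updates.
window.addEventListener('load', function (e) {
  // 'updateready' fires when a new version of the cache has been downloaded.
  window.applicationCache.addEventListener('updateready', function (e) {
    // Reload so the page is served using the updated manifest.
    window.location.reload();
  }, false);
}, false);
</script>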

Related

Exclude page self by appcache

I have an appcache manifest (with NETWORK: *). When I visit my page with <html manifest="/cache.appcache">, the page itself is cached, as are all the images. But I want the page itself not to be cached. How can I do this? I thought NETWORK: * would do the trick.
Regards,
Kevin
The appcache manifest always caches the master page.
If you are using Chrome check the cached files for your page here: chrome://appcache-internals
A workaround could be to put a hidden iframe somewhere on your page, which contains the appcache file to cache offline content. (take a look at "Preventing the application cache from storing masters with an iframe" here: http://labs.ft.com/2012/11/using-an-iframe-to-stop-app-cache-storing-masters/ )
A better solution could be to write your page to fetch new content from your server when it is opened - if the server cannot be reached, it can serve the last known content from the HTML5 local storage.
I have tried the iframe workaround and found it rife with errors. Most browsers cache the data for the iframe where the page cannot get it.
Instead, make the page load its content via AJAX. Basically, have a blank HTML page with the manifest and JavaScript that pulls its content from the server and adds it to the page. This way only the blank HTML is cached, and the content is always updated from the server.
Converting a page to this method can be very difficult, but it works. Making sure the appropriate JavaScript runs at the correct time will probably require some untangling, as will moving server code that is no longer called when the page is served from cache into the new AJAX method.
Note: there is no need to pull conditional content from the server if the condition is in the query string, because different query strings produce separate cache entries.
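The "blank page plus AJAX" approach above can be sketched as follows. This is an illustration under assumptions, not the answerer's code: the fetch function and the target element are passed in so the sketch also runs outside a browser, and loadContent and the URL used are hypothetical.

```javascript
// fetchImpl and target are injected; in a page you would pass window.fetch
// and a DOM element such as document.getElementById('content').
async function loadContent(fetchImpl, url, target) {
  // Ask for a fresh copy from the server instead of an HTTP-cached one.
  const response = await fetchImpl(url, { cache: 'no-store' });
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  // Only the empty shell page is cached; its content is filled in here.
  target.innerHTML = await response.text();
  return target;
}
```

For this to work with a manifest, the content URL has to be reachable through the NETWORK section, for example via the * wildcard.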

My HTML5 Application Cache Manifest is caching everything

UPDATE:
I posted this question when this feature was really new. I realize now that this feature should not be used this way unless it is used via JavaScript, but the hack below seems to be a good solution for beginners who make the same mistake and misuse this feature. If you want to cache everything except your HTML, this should be done with JS, or you could use the solution below.
I guess my question boils down to this:
If the file referencing the manifest through the manifest attribute of the HTML tag falls under the MASTER CACHE ENTRIES, how can a dynamic page use the manifest?
My file looks like this:
CACHE MANIFEST
CACHE:
# IMAGES:
/stylesheets/bg.jpg
/stylesheets/cont_bg.png
#and so forth..
#EXTERNAL
http://chat.mydomain.com/themes/images/panel_bg.png
http://chat.mydomain.com/themes/images/images_core.png
####################################
#STYLE SHEETS:
/stylesheets/min.css
/stylesheets/css_night.aspx
#####################################
#JAVASCRIPT:
/JAVASCRIPT/header_javascript.js
#EXTERNAL:
http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.9/jquery-ui.min.js
http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
FALLBACK:
/ /offline.php
NETWORK:
*
Now the problem is that once I browse to a page not in the manifest, such as one of my actual dynamic PHP files like index.php, and there is no cache yet, Chrome logs:
Adding master entry to Application Cache with manifest http://208.109.248.197/manifest.appcache
Application Cache Downloading event
Application Cache Progress event (0 of 28)
...
Application Cache Progress event (28 of 28)
Application Cache NoUpdate event
So far so good, until I actually load a page, and Chrome logs:
Application Cache UpdateReady event
Adding master entry to Application Cache with manifest http://mydomain.com/manifest.appcache
Now, as you can see in the last line, it adds index.php to my application cache. I have verified this by going to chrome://appcache-internals/
It says:
Flags URL Size (headers and data)
Explicit, http://mydomain/JAVASCRIPT/header_javascript.js 57.5 kB
Master, http://mydomain/home.php 51.2 kB
Master, http://mydomain/index.php 53.5 kB
Master, Fallback, http://mydomain/offline.php 49.4 kB
Things like index.php and home.php are not supposed to be cached. I would like to tell it not to cache anything with an HTML extension, if possible. But here is what I have learned from the specification, I believe:
An online whitelist wildcard flag, which is either open or blocking.
The open state indicates that any URL not listed as cached is to be implicitly treated as being in the online whitelist namespaces; the blocking state indicates that URLs not listed explicitly in the manifest are to be treated as unavailable.
Well, I would like to use this online whitelist wildcard flag and set it to blocking, but I cannot find any further explanation or examples.
I also read:
zero or more URLs that form the online whitelist namespaces.
These are used as prefix match patterns, and declare URLs for which the user agent will ignore the application cache, instead fetching them normally (i.e. from the network or local HTTP cache as appropriate).
I would also like to use a pattern like this, but again I can find no documentation. Why is there so little appcache manifest documentation, and why does no other website I've visited seem to use it? My Chrome appcache directory shows none.
Thank you for your time!
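The two quoted passages can be modelled in a few lines. In the manifest syntax, listing * under NETWORK: is what puts the wildcard flag in the open state; leaving the * out leaves it blocking, and any other NETWORK: entries act as the prefix patterns. The sketch below is an illustrative model only, not how a browser implements it: resolveSource and the manifest object shape are hypothetical, and FALLBACK handling is deliberately left out.

```javascript
// Illustrative model of the quoted rules; FALLBACK entries are ignored here.
// manifest = { cached: [...], networkNamespaces: [...], wildcardOpen: boolean }
function resolveSource(url, manifest) {
  const { cached = [], networkNamespaces = [], wildcardOpen = false } = manifest;

  // Anything listed as cached is served from the application cache.
  if (cached.includes(url)) return 'appcache';

  // Online whitelist namespaces are prefix match patterns.
  if (networkNamespaces.some((prefix) => url.startsWith(prefix))) return 'network';

  // Open: unlisted URLs go to the network. Blocking: they are unavailable.
  return wildcardOpen ? 'network' : 'unavailable';
}
```

Note that this does not change how master entries behave: a page that carries the manifest attribute is still added to the cache.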
Here is a hack I found out by playing around:
I haven't found the ultimate answer, but from what I've learned it seems that the manifest is not meant to be set on every page. Again, I'm not sure, but this is a hack that I came across. I have a page such as manifest.html that has the following:
<html manifest="manifest.appcache">
I learned that pages which do not have this attribute will not be added to the cache, but they will still use the application cache if they are on the same domain. Therefore, if you include manifest.html (a plain HTML page that has this attribute) in an iframe on every page, the including page itself will not be cached, and Chrome will no longer output:
Adding master entry to Application Cache with manifest
But if you go to the Network tab, you will see that it is using the cache.
<iframe id='manifest_iframe_hack'
style='display: none;'
src='temporary_manifest_hack.html'>
</iframe>
content of temporary_manifest_hack.html:
<!DOCTYPE HTML>
<html lang="en" class="no-js" manifest="manifest.appcache">
<head>
<meta charset="utf-8">
<title>Hack 4 Manifest</title>
</head>
<body></body>
</html>
The appcache always contains the page that contains the manifest attribute in the html tag.
If you want that page itself to be dynamic, you must load content into it with an ajax call to a service that is in the NETWORK section.
I guess the iframe workaround doesn't work. If you think the files are loaded from the appcache: no, they're coming from the browser cache.
Disable the browser cache in the DevTools settings and look at the Network tab. You can see that all elements are loaded via the network and don't come from the (app)cache.

Should HTML 5 cache manifest work with ajax requests too?

I'm trying to get HTML 5 offline application cache working with an ASP MVC 3 website. The problem I get is that
when I try to navigate to a page in offline mode, it doesn't work.
I am using an action for the manifest file so that it can be dynamically generated, and in the view I specify
the Response.ContentType = "text/cache-manifest".
I have hosted the application locally in IIS so I'm using http://192.168.55.127/mywebsite/ to access it.
This is the manifest view I'm using. It uses the razor view engine and is a bit messy (hard coded URL etc)
while I try to figure out what's wrong.
@{
    Layout = null;
    Response.ContentType = "text/cache-manifest";
}
CACHE MANIFEST
# Version: @ViewBag.Version
CACHE:
#Script Files
@foreach(var jsFile in Url.GetJsFiles())
{
    @string.Format("{0}{1}\r\n", "http://192.168.55.127", Url.Content(jsFile))
}
#Style Sheets
@foreach(var cssFile in Url.GetCssFiles())
{
    @string.Format("{0}{1}\r\n", "http://192.168.55.127", Url.Content(cssFile))
}
#Images
@foreach(var imageFile in Url.GetImageFiles())
{
    @string.Format("{0}{1}\r\n", "http://192.168.55.127", Url.Content(imageFile))
}
#HTML Pages
@string.Format("{0}{1}", "http://192.168.55.127", Url.Content("~/pages/master.htm"))
@string.Format("{0}{1}", "http://192.168.55.127", Url.Content("~/pages/home.htm"))
@string.Format("{0}{1}", "http://192.168.55.127", Url.Content("~/pages/options.htm"))
NETWORK:
*
This results in paths such as:
http://192.168.55.127/mywebsite/scripts/Libs/jQuery.js
http://192.168.55.127/mywebsite/pages/home.htm
which seems to be fine.
I have referenced the manifest file using the full path too:
<html manifest="http://192.168.55.127/mywebsite/manifest">
which seems to be ok, as when I load the site up in chrome and observe the developer console, it appears
to cache all the files without throwing any errors. Also if I navigate to http://192.168.55.127/mywebsite/manifest
it serves up the manifest as I'd expect to see it.
The website doesn't use normal navigation; instead it navigates using hash fragments, so to navigate to home the URL would be master.htm#home, and for options it would be master.htm#options. The hash change is picked up by JavaScript, which loads the page into a div container in the master page using AJAX; more specifically, it uses the 'load' method in jQuery.
This all works fine when not in offline mode, and when observing the Network tab in Chrome while navigating, the request URL is correct and is the same URL that is listed in the manifest file. The only thing I can think of is that offline mode doesn't work for AJAX requests, but I was under the impression that it worked the same.
I am testing offline mode using Firefox (version 9.0) by clearing all history, browsing to the website home page, enabling offline mode, then trying to navigate to the options page. In Firebug I see a GET request for the correct URL of the options page, but it never returns; it doesn't even error. The loading wheel (next to the request in the Net tab in Firebug) just keeps turning as if it is still loading. I tried it in Opera 11.60 too (as that also has an offline mode) and the same kind of thing happens.
Does anyone have any ideas as to what I'm doing wrong? Have I missed something obvious or misunderstood how the manifest should work? Any suggestions will be appreciated.
(I know the question's old but for future reference...)
If the AJAX content files are listed in the AppCache manifest file properly (which they seem to be) then this should work. Personally, I'd use relative rather than absolute paths but that shouldn't make a difference.
Your problem seems to be that the manifest file doesn't have a file extension. Try renaming the file (and its reference in master.htm) to appcache.manifest or similar. Then you need to make sure the manifest file's MIME type is set in the server. E.g. for Apache you'd add something like:
AddType text/cache-manifest .manifest
to the server's config file or your .htaccess file.
Also, as well as clearing cached data when testing, make sure you refresh the page at least a couple of times when you make a change to the manifest file because the browser checks for updates and downloads files in separate page loads.
Finally, it won't work if the files you're pulling in with AJAX have parameters in the URL, e.g. ?id=1234, but are not listed as such in the manifest file. That doesn't seem to be the case here but it's something to be aware of.
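The last point, that a requested URL must appear in the manifest exactly, query string included, can be checked mechanically. A rough sketch, assuming the manifest text and the list of URLs the page requests are available as strings; parseExplicitEntries and findUnlistedUrls are hypothetical helper names, and the parser only covers the simple syntax used in this thread.

```javascript
// Collects the explicit (CACHE) entries from a manifest's text.
function parseExplicitEntries(manifestText) {
  const entries = [];
  let section = 'CACHE'; // entries before any section header are explicit entries
  const lines = manifestText.split(/\r?\n/).slice(1); // skip the "CACHE MANIFEST" line
  for (const raw of lines) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue; // blank lines and comments
    if (/^[A-Z]+:$/.test(line)) {                      // CACHE:, NETWORK:, FALLBACK:
      section = line.slice(0, -1);
      continue;
    }
    if (section === 'CACHE') entries.push(line);
  }
  return entries;
}

// Returns the requested URLs that are not listed exactly as written.
function findUnlistedUrls(requestedUrls, manifestText) {
  const listed = new Set(parseExplicitEntries(manifestText));
  return requestedUrls.filter((url) => !listed.has(url));
}
```

Any URL the function returns is one the page requests but the manifest does not list in that exact form, such as a listed page requested with an extra ?id=1234.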

HTML5 Cache fallback

I am experimenting with HTML5 caching and I have stumbled onto a problem.
CACHE MANIFEST
/Default.aspx
/Offline.aspx
/js/jquery-1.6.4.min.js
/js/jquery.mobile-1.0rc2.min.js
/css/jquery.mobile-1.0rc2.min.css
/css/images/ajax-loader.png
/css/images/icons-18-white.png
FALLBACK:
/ Offline.aspx
NETWORK:
*
So my starting page is Default.aspx. When the device goes offline it should redirect to /Offline.aspx, but it doesn't. All I can figure is that this is because /Default.aspx is cached.
Now let's say I remove /Default.aspx from the manifest. It would still be cached, because it references the manifest in the HTML tag.
I have read dozens of pages concerning HTML caching but I can't find an answer.
Any advice would be great!
Thanks
Yes, this is the behavior you should expect. If the page that references the manifest is not declared in the manifest itself (as an explicit entry), it is implicitly considered part of the manifest as a "master" page, and from that point forward it is cached and not updated until the manifest changes.
This was not entirely clear to me either until I experienced that same behavior (in the application that I was adding offline capabilities to) and dug into the spec to better understand the observed behavior.
My solution to this was to turn the dynamic parts of that page into separate Ajax calls so that even though the page was cached (implicitly or explicitly) the parts of it that updated continued to be updated through the (non-cached) Ajax calls. However, you'll want to create fallback entries for said Ajax calls if you want them to behave nicely when offline (or handle the resulting Ajax errors if not).
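The "handle the resulting Ajax errors" option could look roughly like the following: keep the last good response and serve it when the request fails. This is a sketch under assumptions rather than the answerer's implementation. The fetch function and the store are injected (a Map here; in a browser something like localStorage could back it), and fetchWithLastKnown is a hypothetical name.

```javascript
// Tries the network first; on failure, falls back to the last stored copy.
async function fetchWithLastKnown(fetchImpl, url, store) {
  try {
    const response = await fetchImpl(url);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    const body = await response.text();
    store.set(url, body); // remember the last good copy
    return { body, stale: false };
  } catch (err) {
    if (store.has(url)) {
      // Offline or server error: serve the last known content instead.
      return { body: store.get(url), stale: true };
    }
    throw err; // nothing to fall back to
  }
}
```

The stale flag lets the page tell the user that the content shown may be out of date.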

HTML5: Offline Cache is ignored if loading different page under the same URL

I have a web application that has a constant URL and an internal state machine. The states are changed via POSTs. I know it is a bad design and I should use a REST approach. But given this, I have the following problem.
I use the HTML5 offline cache (the manifest attribute in the HTML tag). For the first page (the login page) it is parsed and cached as I would expect. But for the second page (the main menu) the manifest included there is not parsed. No events are shown in Chrome. If I change the URL a little by including a parameter, then the manifest is parsed, but not before.
Even if I include everything in the login page manifest, the second page downloads the same files again, even though they are specified in the manifest for the first page.
Why this behaviour?
To answer it myself: it looked so odd simply because the manifest is only parsed on GET requests and ignored on POST requests, even if the POST loads another HTML page. To me this is a little bit silly, but it seems that is how it works.
Now it finally works as it is supposed to.
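The answer does not say how the application was changed. One common way to make sure a page arrives via GET after a form submission is to answer each POST with a redirect (the Post/Redirect/Get pattern). The sketch below assumes a Node.js-style request and response; handleStateChange, applyTransition and the /app?state=... URL are all hypothetical, and whether this suits the state machine described in the question has not been verified.

```javascript
// Post/Redirect/Get sketch: a POST changes state, then the browser is told
// to fetch the resulting page with a GET request.
function handleStateChange(req, res, applyTransition) {
  if (req.method === 'POST') {
    const nextState = applyTransition(req); // hypothetical state machine step
    res.statusCode = 303;                   // "See Other": follow up with GET
    res.setHeader('Location', `/app?state=${encodeURIComponent(nextState)}`);
    res.end();
    return;
  }
  // GET: serve the page that carries the manifest attribute.
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<!DOCTYPE html><html manifest="manifest.appcache"><body>...</body></html>');
}
```

Because the page itself is then always loaded with GET, the manifest attribute on it is seen on a request type the browser processes.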