How can I get rid of the HTML5 offline cache? - html

I have an application which used to use the HTML5 offline cache. Now I've decided to not use it anymore and removed the manifest attribute from the index.html file. However, browsers still regard this site as cached and refuse to update the index.html file.
Even updating the manifest doesn't help. How can I remove the site from the user's offline caches? Am I stuck with a cached web site forever?

You need to make sure the manifest file isn't being cached, which by default it will be.
Adding
ExpiresActive On
ExpiresDefault "access"
To your .htaccess will stop everything being cached, though you really just want the manifest file to be cached in this way like this: (remember to update filename)
<Files cache.manifest>
ExpiresActive On
ExpiresDefault "access"
</Files>
You really need to do that first, but this should alleviate the problem.
I'd recommend reading through Mark Pilgrim's page on this as well.

Try changing contents of your manifest to simply CACHE MANIFEST with no files listed. The clients should retrieve the new manifest next time they hit the site and their cache should be removed.
Note however that they won't be using this new, empty manifest until they refresh the page.
I've found that in some cases on some browsers they don't necessarily grab the new manifest right away. This behavior seems inconsistent though. When this happens I tend to clear their caches / offline storage manually in order to force them to update (though I understand you can't necessarily get users to do this).

Related

How to stop using cache manifest on a live site

I changed my cache.manifest file to the following:
CACHE MANIFEST
NETWORK:
*
CACHE:
FALLBACK:
This triggered an update to my site.
I tried adding a "#" to the manifest file and then removed the manifest="cache.manifest"from my page.
This triggered the cache to be updated again, even though the reference to the manifest was gone. The console indicated it was still being loaded from cache.
I even tried renaming my cache.manifest file and it still was being loaded from cache.
How in the world can I safely stop using cache manifest? I have a completely new version of my site I want to roll out, but if cache manifest is still trying to cache the new site, that will be a disaster for all my visitors who do not know they need to clear their browser cache.
I believe I found the trick.
It seems I need to keep the old cache.manifest but change (not remove) the reference to a non-existent manifest file.
Presumably, I'll need to keep the broken reference there for several years in case I have visitors who only visit periodically.

How to publish changes to HTML5 cached files?

I know how to tell the browser to download cached fuiles again by simply changing a character in the appcache manifest, but when I do that, how can I make sure the browser downloads the new file without doing things like changing filenames?
I am aware of file expiration headers I can send, but I have no experience with them. Would they even work with HTML5 caching? Which ones do I send?
I'm under the impression that browsers aren't smart enough to detect when a file is modified, and will continue using the cached file until you force it by refreshing the page or changing the filename. I don't want to do that since it also means updating the manifest and is just extra work.
My optimal solution is to change the manifest slightly, then the browser goes and fetches any changed files without me forcing it to.
What I do is add the timestamp as a comment in the manifest
# 20130623 025200
And just update it each time I want to force a refresh.
EDIT: As I noted in the comments, The browser will re-download all files explicitly enumerated in the manifest. For files that are not in the manifest (for example, CSS or images referenced in an HTML file but not in the manifest), the default expiry take precedence.
The algorithm is described in the standard: http://www.w3.org/html/wg/drafts/html/master/browsers.html#downloading-or-updating-an-application-cache

How do I specify a wildcard in the HTML5 cache manifest to load all images in a directory?

I have a lot of images in a folder that are used in the application. When using the cache manifest it would be easier maintenance wise if I could specify a wild card to load all the images or files in a certain directory to be cached.
E.g.
CACHE MANIFEST
# 2011-11-3-v0.1.8
#--------------------------------
# Pages
#--------------------------------
../index.html
../edit.html
#--------------------------------
# JavaScript
#--------------------------------
../js/jquery.js
../js/main.js
#--------------------------------
# Images
#--------------------------------
../img/*.png
Can this be done? Have tried it in a few browsers with ../img/* as well but it doesn't seem to work.
It would be easier, but how's it going to work? The manifest file is something which is parsed and acted upon in the browser, which has no special knowledge of files on your server other than what you've told it. If the browser sees this:
../img/*.png
What is the first image the browser should request from the server? Let's start with these:
../img/1.png
../img/2.png
../img/3.png
../img/4.png
...
../img/2147483647.png
That's all the images that might exist with a numeric name, stopping semi-arbitrarily at 231-1. How many of those 2 billion files exist in your img directory? Do you really want a browser making all those requests only to get 2 billion 404s? For completeness the browser would probably also want to request all the zero-filled equivalents:
../img/01.png
../img/02.png
../img/03.png
../img/04.png
...
../img/001.png
../img/002.png
../img/003.png
../img/004.png
...
../img/0001.png
../img/0002.png
../img/0003.png
../img/0004.png
...
Now the browser's made more than 4 billion HTTP requests for files which mostly aren't there, and it's not yet even got on to letters or punctuation in constructing the possible filenames which might exist on the server. This is not a feasible way for the manifest file to work. The server is where the files in the img directory are known, so it's on the server that the list of files has to be constructed.
I don't think it works that way. You'll have to specify all of the images one by one, or have a simple PHP script to loop through the directory and output the file (with the correct text/cache-manifest header of course).
It would be a big security issue if browsers could request folder listings - that's why Tomcat turns that capability off by default now.
But, the browser could locate all matches to the wildcards referenced by the pages it caches. This approach would still be problematic (like, what about images not initially used but set dynamically by JavaScript, etc., and it would require that all cached items not only be downloaded but parsed as well).
If you are trying automate this process, instead of manually doing it. Use a script, or as I do I use manifestR. It will output your manifest/appcache file and all you have to do is copy and paste. I've used it successfully and usually only have to make a few changes.
Also, I recommend using the network header with the wild card:
NETWORK:
*
This allows all assets from other linked domains via JSON, for instance, to download into the cache. I believe that this is the only header where you can specify a wildcard. Like the others have said here, it's for security reasons.
The cache manifest is now deprecated and you should use HTML headers to control caching.
For example:
<meta http-equiv="Cache-control" content="public">
Public - may be cached in public shared caches.
Private - may only be cached in private cache.
No-Cache - may not be cached.
No-Store - may be cached but not archived.

HTML5 Offline Appcache Updates Not Showing In Firefox

I have an index.php file in my docroot. It produces output that starts with this:
<!DOCTYPE html>
<html manifest="manifest.appcache">
The manifest.appcache tells browsers to cache it for offline use. Again, the relevant parts:
CACHE MANIFEST
#version 8-25-2011
CACHE:
#internal HTML documents
#this tells the browser to cache the HTML it retrieves from http://example.com/
/
NETWORK:
*
Offline access is working fine with this setup, but updating is not working as I would expect in Firefox.
In Chrome and Safari, when I update the index.php file and then change a comment in the cache.manifest file, the browsers will grab the new index.php output and use that in the cache.
However, in Firefox, it seems to not care that I've updated the manifest.appcache file. I suspect that if I wait long enough, it will update, but I've tried waiting hours.
How can I find and eliminate my caching problem?
What HTTP cache headers are you sending with your index.php file? If you've not set things like the Cache-control: and Expires: headers then Firefox could be refreshing the application cache version of the page from it's regular cache instead of requesting it again from the server.
EDIT BY POSTER OF THE QUESTION:
For anyone that wants to know what exactly it took, here's what I put in my .htaccess file based on this answer and a perusal of http://www.diveintohtml5.info/offline.html:
<Files *.appcache>
ExpiresActive On
ExpiresDefault "access"
</Files>
Hope that helps the next person!
I know I'm really late to the party but I've been seeing this issue in Firefox for years and was hoping that the underlying bug would be fixed.
Unfortunately that hasn't happened but I've finally come up with a workaround. In my case, whilst the new .appcache file is loaded and processed, a page reload does not cause the newly cached versions to be used. The process I'm using goes as follows:
index.html is loaded and specifies the .appcache file in the html tag.
The .appcache file is generated dynamically using a PHP script. The script hashes all included files to create a unique version hash, which is included in the manifest. This means a change in any files listed in the manifest forces a cache reload.
My .htaccess file has the following to prevent the .appcache manifest from being cached:
<Files *.appcache>
ExpiresActive On
ExpiresDefault "access plus 0 seconds"
</Files>
My Javascript code detects the appcache update and reloads the page once the updated files have been fetched:
appCache.addEventListener('updateready', function(e) {
console.log("Appcache update finished, reloading...");
setLoadingBar(100, "Loading...");
appCache.swapCache();
location.reload();
});
Once the page reloads, the old cache is still used in Firefox until the cache is manually cleared by the user. In all other browsers I've tested, the newly cached files take immediate affect.
The fix turned out to be painfully simple!
All that was needed was to change the location.reload() line to include the true parameter:
location.reload(true)
This seems to indicate that Firefox serves the files from it's normal cache rather than using the appcache stored files, even when the appcache'd files are newer. I'm guessing this is because Firefox puts the normal caching mechanism in front of appcache like so:
Request -> Normal cache -> Appcache -> Network request
But that's just a guess.

crossdomain.xml prevent caching

After changing domain name where flash application being hosted I should change crossdomain.xml file. That crossdomain.xml is hosted on api-server, which is used by flash application. I see that flash uses crossdomain.xml from browser's cache. Is there any trick to make flash to not get crossdomain.xml from cache? Maybe there is any parameter, that I can pass to flash during it's call in object tag?
Annoying issue - no doubt.
First of all: I like Caching - as long as I'm in control.
This is, how I gain control over crossdomain.xml caching:
Let's say, we have a flash app that requires some input from a different server.
In my case we have this configured as a flashvar dataSrc=http://www.company.com/data/calendar.xml
So flash is looking for
www.company.com/crossdomain.xml
... which is loaded once and than taken from users browser cache until he manually flushes it.
The solution is in changing the subdomain the crossdomain.xml ist taken from:
Make sure, that for example(!) noCache.company.com/ points to company.com's documentRoot.
Flashvar is modified to dataSrc=http://noCache.company.com/data/calendar.xml. In fact, you're addressing the same file as before.
Flash is looking for noCache.company.com/crossdomain.xml which will be taken from the Server now because there is no cached file for that uri.
It's up to your fantasy... if you allow subdomains like noCache_{numeric_value}, you could easily handle your own TTL by accessing http://noCache_{week_of_year}.company.com/data/calendar.xml ...
You can as well increment that numeric value each time crossdomain.xml has changed.
Use following apache directives to specify caching policy for the file:
<Directory /var/www/mysite>
<FilesMatch "crossdomain.xml">
Header set Cache-Control "max-age=86400, public, must-revalidate"
</FilesMatch>
</Directory>
I append random numbers to the end of xml files if I don't want them to cache
eg. var myXMLURL:String = "myXML.xml?r=" + Math.random()*1000;
The browser sees it as a different file eg. myXML.xml?r=645 / myXML.xml?r=239
I'm not sure if this would work with crossdomain.xml files, but it should be worth a quick try.
I would forced reload (F5 or CTRL/CMD-F5) the crossdomain.xml file directly in the browser until I see it changes. Just type in the crossdomain file's URL in the browser and keep on refreshing. Also I would clean the browser cache.
You should try Firefox and firebug which shows you whether the downloaded files are cached or not.
http://getfirebug.com/
Good luck,
Rob