How to publish changes to HTML5 cached files? - html

I know how to tell the browser to download cached files again by simply changing a character in the appcache manifest, but when I do that, how can I make sure the browser downloads the new file without doing things like changing filenames?
I am aware of file expiration headers I can send, but I have no experience with them. Would they even work with HTML5 caching? Which ones do I send?
I'm under the impression that browsers aren't smart enough to detect when a file is modified, and will continue using the cached file until you force it by refreshing the page or changing the filename. I don't want to do that since it also means updating the manifest and is just extra work.
My optimal solution is to change the manifest slightly, then the browser goes and fetches any changed files without me forcing it to.

What I do is add the timestamp as a comment in the manifest
# 20130623 025200
And just update it each time I want to force a refresh.
EDIT: As I noted in the comments, the browser will re-download all files explicitly enumerated in the manifest. For files that are not in the manifest (for example, CSS or images referenced in an HTML file but not in the manifest), the default expiry takes precedence.
The algorithm is described in the standard: http://www.w3.org/html/wg/drafts/html/master/browsers.html#downloading-or-updating-an-application-cache
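If you would rather not edit the comment by hand, a small build step can do it. This is only a minimal sketch: the function name stamp_manifest is made up for illustration, and it assumes the stamp is a comment line in the same "# YYYYMMDD HHMMSS" style as above.

```python
import re
from datetime import datetime

# Matches a comment line in the "# 20130623 025200" style used above.
STAMP_RE = re.compile(r"^# \d{8} \d{6}$", re.MULTILINE)


def stamp_manifest(manifest_text: str, now: datetime) -> str:
    """Return manifest_text with its timestamp comment set to `now`.

    Hypothetical helper: the comment format is an assumption, not part
    of the appcache format itself (any comment change would do).
    """
    stamp = now.strftime("# %Y%m%d %H%M%S")
    if STAMP_RE.search(manifest_text):
        # Replace the existing stamp so the manifest content changes.
        return STAMP_RE.sub(stamp, manifest_text, count=1)
    # No stamp yet: add one right after the first ("CACHE MANIFEST") line.
    first, _, rest = manifest_text.partition("\n")
    return first + "\n" + stamp + "\n" + rest
```

Reading the manifest file, passing its text through this function with datetime.now(), and writing it back is enough to make the browser see a changed manifest.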

Related

How do browsers determine the manifest has been updated?

I have noticed that a touch, or another modification of the manifest file's metadata will not trigger an update (at least when testing with Google Chrome). The browser will fire the noupdate event unless I change something more meaningful in the file, such as adding a line of whitespace.
How does the browser compare the old manifest against the new? Does it download the new manifest and compare it? Is it determined somehow from the file header?
http://diveintohtml5.info/offline.html includes a good explanation of the process that the browser follows when loading the manifest.
Briefly:
the browser attempts to get a new copy of the manifest
if you have an old version, it will compare the content of the manifest file to the old manifest - so if your index page has changed, but you have no new files, it'll fire noupdate. Any change to the manifest contents, as trivial as adding whitespace (as you saw), will cause your listed assets to be redownloaded.
a common way to deal with this is to add a SHA or something as a comment to your manifest, generated in such a way that it'll change when any of the assets themselves change.
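One possible way to generate such a comment, as a rough sketch: hash every asset's URL and contents, and write the digest into the manifest. The function name manifest_with_digest and the "# rev" comment prefix are assumptions made for this example; SHA-1 is used only as a change detector here.

```python
import hashlib
from typing import Mapping


def manifest_with_digest(assets: Mapping[str, bytes]) -> str:
    """Build a manifest whose comment changes whenever any asset changes.

    `assets` maps each URL to be listed to that file's current bytes.
    Hypothetical helper for illustration only.
    """
    digest = hashlib.sha1()
    for url in sorted(assets):
        # Feed both the URL and the content, separated by NUL bytes, so
        # renames and edits both alter the digest.
        digest.update(url.encode("utf-8"))
        digest.update(b"\0")
        digest.update(assets[url])
        digest.update(b"\0")
    lines = ["CACHE MANIFEST", "# rev " + digest.hexdigest(), "CACHE:"]
    lines.extend(sorted(assets))
    return "\n".join(lines) + "\n"
```

Because the digest only depends on the assets, rebuilding without touching any file produces an identical manifest, and the browser fires noupdate as described above.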

HTML cache manifest download all files

It seems that if the cache manifest on the server changes, it will re-download everything in the file. Is there any way to make it re-download only some of the files? If I only make code fixes to a script and then update the manifest (say just put a timestamp comment in it) to force local copies to see a change has been made, I might not want images re-downloaded that haven't changed but still need to be in the manifest. Is there a way to get more precision around this or is it an all or nothing thing?
From what I've experienced, updating the cache manifest doesn't explicitly download all the content again, but rather checks whether each file has been modified. So if there are 10 items in your cache manifest and you were to update a single file (and also the manifest file), then each file would be checked for modification, and depending on how they are being served (CDN?) they should be returning a 304 Not Modified, and thus not be downloaded again.
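To make the 304 part concrete, here is a simplified sketch of the decision a server makes for a conditional request. It assumes the server hands out an ETag validator and the browser sends it back in If-None-Match; the function names are hypothetical, and real servers also handle lists of tags, weak validators and If-Modified-Since.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the file content (one common approach)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(request_headers: Dict[str, str], body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of a file with `body`."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is still current: send no payload at all.
        return 304, {"ETag": etag}, b""
    # First request, or the file changed: send the full content.
    return 200, {"ETag": etag}, body
```

With this behaviour, an unchanged image listed in the manifest costs one small round trip per update instead of a full download.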

Cache Manifest messes up my app when online

Gurus of SO
I am trying to play with CACHE MANIFEST/HTML5. My app is JS heavy and built on jquery/jquerymobile.
This is an excerpt of what my Manifest looks like
CACHE MANIFEST
FALLBACK:
/
NETWORK:
*
CACHE:
/css/style.css
/js/jquery.js
But somehow, the app doesn't load the files the first time itself and the entire app breaks down.
Is my format wrong?
Should I never load JS into the Cache?
How should I treat this differently to always check the network first if anything isn't available and only load stuff available from the Cache?
Thank you.
I tried a simple page with your cache manifest and it worked fine for me, so I'm not really sure what the problem is. But,
Yes, there is something wrong with the format. The entries in the FALLBACK section need to have two parts: a pattern, and a URL. This says "if any page matching the pattern is not available offline, display the URL instead (which will be cached)." The main example of this (as shown here) is "/ /offline.html", which means "for all pages, if we are offline and they are not cached, display /offline.html instead." However, I don't think this is the source of your problem since I tested it with your exact manifest and it still worked.
There is nothing special about JS files. It should be fine to load them into the cache.
I don't understand the third question. There are possibly two goals here: a) how do you check to see if there is a newer version of the file available online first, before going back to the cache, and b) how do you check the network to see if there is a file that is not cached, and if we are offline, fall back to an error page. The answer to (a) is that once you have turned on the cache manifest, things work very differently. It will never check for new versions of the files unless there is a new version of the manifest also. So you must always update the manifest whenever you change any files. The answer to (b) is the FALLBACK section.
See Dive Into HTML5's excellent chapter on this, particularly the section "The fine art of debugging, a.k.a. “Kill me! Kill me now!”" which explains how the manifest updates.
Also I don't think we've gotten to the meat of your question, because it's unclear what you mean by "the app doesn't load the files the first time itself". Which files don't load? Do they load properly after a refresh? Etc.
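For the FALLBACK format issue in particular, a quick check along these lines could catch it before deployment. It is only a sketch: bad_fallback_lines is a made-up name, and the parsing is deliberately simplified (it does not validate URLs or the other sections).

```python
from typing import List, Tuple

SECTION_HEADERS = {"CACHE:", "NETWORK:", "FALLBACK:", "SETTINGS:"}


def bad_fallback_lines(manifest_text: str) -> List[Tuple[int, str]]:
    """Return (line number, text) for FALLBACK entries lacking two parts."""
    problems = []
    section = "CACHE:"  # entries before any header are explicit cache entries
    for number, raw in enumerate(manifest_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blanks, comments and the "CACHE MANIFEST" signature line.
        if not line or line.startswith("#") or line == "CACHE MANIFEST":
            continue
        if line in SECTION_HEADERS:
            section = line
            continue
        # A fallback entry is "<pattern> <fallback URL>": exactly two tokens.
        if section == "FALLBACK:" and len(line.split()) != 2:
            problems.append((number, line))
    return problems
```

Run against the manifest in the question, this reports line 3 (the lone "/"); changing that line to "/ /offline.html" makes the list come back empty.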
The only way I got this to work to refresh a cache was to rename the manifest file with a commit number or timestamp, and change the cache declaration to
<html manifest='mymanifest382330.manifest'>
I made this part of my build.
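A build step for this might look roughly like the following. The helper names and the regex are my own assumptions for illustration, and the regex only handles a quoted manifest attribute on the opening html tag.

```python
import re

# Captures everything up to the opening quote of the manifest attribute,
# then the quote character, so the same quote style can be written back.
MANIFEST_ATTR_RE = re.compile(r"""(<html\b[^>]*\bmanifest=)(['"])[^'"]*\2""")


def manifest_name(build_id: str) -> str:
    """Hypothetical naming scheme: mymanifest<build id>.manifest."""
    return "mymanifest" + build_id + ".manifest"


def point_html_at(html: str, new_manifest: str) -> str:
    """Rewrite the manifest attribute of the <html> tag to new_manifest."""
    def replace(match: "re.Match[str]") -> str:
        quote = match.group(2)
        return match.group(1) + quote + new_manifest + quote

    return MANIFEST_ATTR_RE.sub(replace, html, count=1)
```

The build would then also write the manifest itself under the name returned by manifest_name, so the attribute and the file stay in sync.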

HTML5 Cache -- Is it possible to have several distinct caches for a single URL?

Every URL can be linked to a single cache manifest. But I want several cache manifests linked to a same URL. Here is the reason:
Some files I want to be cached are rarely updated and large.
So every time the cache gets updated these large files get re-downloaded even though they may not have been changed.
So I want to split up the cache. One cache for these rarely updated large files and another cache for the often updated light files.
Do you guys have any idea how to split up an HTML5 cache?
The most efficient way is:
a) Use far-future expiration date (max-age) on all resources mentioned in manifest's CACHE section and add timestamp suffix to each file in the CACHE section, e.g.:
CACHE:
menu_1355817388000.js
toolbar_1355817389100.js
b) When any of the above files change on the server, regen/update manifest to change the timestamp. Only the file with the modified timestamp will get downloaded next time. Mission accomplished.
Note: Reload the page twice in the browser, as on the first refresh browser downloads just the manifest and uses old cached resources to paint the page. This is done to speed up displaying the page (there are tricks to handle this issue of double refresh, but they are outside the scope of your question)
See more info in this long article, the best I have ever seen on appcache.
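Steps (a) and (b) can be scripted. Below is a small sketch that builds the CACHE section from each file's last-modified time in milliseconds; the function names are hypothetical, and it assumes your build also serves (or copies) the files under the stamped names.

```python
import posixpath
from typing import Mapping


def stamped_name(filename: str, mtime_ms: int) -> str:
    """Insert the timestamp before the extension: menu.js -> menu_<ts>.js."""
    stem, ext = posixpath.splitext(filename)
    return f"{stem}_{mtime_ms}{ext}"


def cache_section(mtimes: Mapping[str, int]) -> str:
    """Build the CACHE section from a {filename: mtime in ms} mapping."""
    lines = ["CACHE:"]
    for filename in sorted(mtimes):
        lines.append(stamped_name(filename, mtimes[filename]))
    return "\n".join(lines)
```

For example, cache_section({"menu.js": 1355817388000, "toolbar.js": 1355817389100}) gives the two entries shown in (a). Only a file whose timestamp changed gets a new URL, so the other entries can still be satisfied from the HTTP cache.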
Use an iframe
Your page's cache manifest would include the light files and the cache manifest of an iframe loaded by this page would include the large files
On Chrome the iframe's application cache will also be used for the page. I haven't tested this method on other browsers yet.
see a live example at http://www.timer-tab.com and if you are using chrome see its split up cache at chrome://appcache-internals/
When the manifest file is changed and the files of the application cache are downloaded again, the normal HTTP caching rules still apply. This means that if you set the correct HTTP caching headers for these large files, you'll get a 304 so these files are not downloaded again. So it's not necessary to split the application cache.
This may not be a full answer, but I'd like to shed some light on my findings as I troubleshoot my own webapp.
I've discovered that I can use 2 iframes (manifest_framework and manifest_media) to load the manifests. I'm still not exactly clear on how they are targeted, and I had only limited success.
manifest_framework:
CACHE MANIFEST
CACHE:
appdata.ini
dialog.png
jquery.min.js
login.htm
login.js
manifest.appcache.js
NETWORK:
*
FALLBACK:
manifest_media:
CACHE MANIFEST
CACHE:
manifest_fwk.php
od/audio_track_1_1.m4a
od/audio_track_1_2.m4a
od/audio_track_1_3.m4a
od/audio_track_1_4.m4a
od/video_1.mp4
od/video_2.mp4
od/video_3.mp4
NETWORK:
*
FALLBACK:
./ webapp.php
./index.php is the 'landing page', which itself isn't cached but falls back to webapp.php when offline.
What I don't understand is how these link to the webapp.php page.
I am finding I can only get access to one or the other manifests cache.
The above works in Mobile Safari: the media would be cached, and the image, but not necessarily the JS or images in the framework manifest.
Anyone have more examples where multiple manifests are referenced from the one URL/page?
The W3C working group has abandoned the File System API, so it SHOULD NOT BE USED anymore.
We'll likely see it fall off the next version of Chrome.
http://www.w3.org/TR/file-system-api/
CACHE MANIFEST
# This is a comment.
# Cache manifest version 0.0.1
# If you change the version number in this comment,
# the cache manifest is no longer byte-for-byte
# identical.
demoimages/mypic.jpg
demoimages/yourpic.jpg
demoimages/ourpic.jpg
sr/scroll.js
NETWORK:
# All URLs that start with the following lines
# are whitelisted.
# whitelisted items are needed to help the site function, you could put regularly
# changing items here
http://example.com/examplepath/
http://www.example.org/otherexamplepath/
CACHE:
# Additional items to cache.
demoimages/allpics.jpg
FALLBACK:
demoimages/currentImg.jpg images/stockImage.jpg
If the iframe trick does not work, use the HTML5 FileSystem API
See http://updates.html5rocks.com/2012/04/Taking-an-Entire-Page-Offline-using-the-HTML5-FileSystem-API

How can I get rid of the HTML5 offline cache?

I have an application which used to use the HTML5 offline cache. Now I've decided to not use it anymore and removed the manifest attribute from the index.html file. However, browsers still regard this site as cached and refuse to update the index.html file.
Even updating the manifest doesn't help. How can I remove the site from the user's offline caches? Am I stuck with a cached web site forever?
You need to make sure the manifest file isn't being cached, which by default it will be.
Adding
ExpiresActive On
ExpiresDefault "access"
to your .htaccess will stop everything being cached, though you really only want this applied to the manifest file, like this (remember to update the filename):
<Files cache.manifest>
ExpiresActive On
ExpiresDefault "access"
</Files>
You really need to do that first, but this should alleviate the problem.
I'd recommend reading through Mark Pilgrim's page on this as well.
Try changing contents of your manifest to simply CACHE MANIFEST with no files listed. The clients should retrieve the new manifest next time they hit the site and their cache should be removed.
Note however that they won't be using this new, empty manifest until they refresh the page.
I've found that in some cases on some browsers they don't necessarily grab the new manifest right away. This behavior seems inconsistent though. When this happens I tend to clear their caches / offline storage manually in order to force them to update (though I understand you can't necessarily get users to do this).