HTML5 Offline Appcache Updates Not Showing In Firefox

I have an index.php file in my docroot. It produces output that starts with this:
<!DOCTYPE html>
<html manifest="manifest.appcache">
The manifest.appcache tells browsers to cache it for offline use. Again, the relevant parts:
CACHE MANIFEST
#version 8-25-2011
CACHE:
#internal HTML documents
#this tells the browser to cache the HTML it retrieves from http://example.com/
/
NETWORK:
*
Offline access is working fine with this setup, but updating is not working as I would expect in Firefox.
In Chrome and Safari, when I update the index.php file and then change a comment in the manifest.appcache file, the browsers grab the new index.php output and use that in the cache.
However, in Firefox, it seems to not care that I've updated the manifest.appcache file. I suspect that if I wait long enough, it will update, but I've tried waiting hours.
How can I find and eliminate my caching problem?

What HTTP cache headers are you sending with your index.php file? If you've not set things like the Cache-Control: and Expires: headers, then Firefox could be refreshing the application cache version of the page from its regular cache instead of requesting it again from the server.
EDIT BY POSTER OF THE QUESTION:
For anyone that wants to know what exactly it took, here's what I put in my .htaccess file based on this answer and a perusal of http://www.diveintohtml5.info/offline.html:
<Files *.appcache>
ExpiresActive On
ExpiresDefault "access"
</Files>
Hope that helps the next person!
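For servers that are not configured through .htaccess, the same idea can be expressed in application code. The following is only a minimal sketch, assuming a Node-style server where you set response headers yourself; the helper name headersFor and the exact header values are assumptions for illustration, not something taken from the answer above.

```javascript
// Hypothetical helper: picks response headers for a requested path.
// The name and the exact header set are assumptions for illustration.
function headersFor(path) {
  if (/\.appcache$/.test(path)) {
    return {
      'Content-Type': 'text/cache-manifest',
      // Ask caches to revalidate so a changed manifest is seen on the next check.
      'Cache-Control': 'no-cache, must-revalidate',
      'Expires': '0'
    };
  }
  // Everything else keeps whatever defaults the server already applies.
  return {};
}

console.log(headersFor('/manifest.appcache'));
```

Only the manifest gets the "expire immediately" treatment here, which mirrors the <Files *.appcache> scoping in the .htaccess snippet.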

I know I'm really late to the party but I've been seeing this issue in Firefox for years and was hoping that the underlying bug would be fixed.
Unfortunately that hasn't happened but I've finally come up with a workaround. In my case, whilst the new .appcache file is loaded and processed, a page reload does not cause the newly cached versions to be used. The process I'm using goes as follows:
index.html is loaded and specifies the .appcache file in the html tag.
The .appcache file is generated dynamically using a PHP script. The script hashes all included files to create a unique version hash, which is included in the manifest. This means a change in any files listed in the manifest forces a cache reload.
My .htaccess file has the following to prevent the .appcache manifest from being cached:
<Files *.appcache>
ExpiresActive On
ExpiresDefault "access plus 0 seconds"
</Files>
My Javascript code detects the appcache update and reloads the page once the updated files have been fetched:
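var appCache = window.applicationCache;
appCache.addEventListener('updateready', function(e) {
    // Fired once the updated files have been downloaded.
    console.log("Appcache update finished, reloading...");
    setLoadingBar(100, "Loading..."); // my own UI helper, defined elsewhere
    appCache.swapCache();
    location.reload();
});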
Once the page reloads, the old cache is still used in Firefox until the cache is manually cleared by the user. In all other browsers I've tested, the newly cached files take effect immediately.
The fix turned out to be painfully simple!
All that was needed was to change the location.reload() line to include the true parameter:
location.reload(true)
This seems to indicate that Firefox serves the files from its normal cache rather than using the appcache-stored files, even when the appcache'd files are newer. I'm guessing this is because Firefox puts the normal caching mechanism in front of appcache like so:
Request -> Normal cache -> Appcache -> Network request
But that's just a guess.
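The answer generates its manifest with a PHP script that is not shown. As a rough illustration of the hashing step only, here is a sketch in JavaScript (Node); the function name buildManifest, the use of SHA-1, and the in-memory map of file contents are all assumptions, and a real script would read the files from disk.

```javascript
const crypto = require('crypto');

// Hypothetical input: a map of manifest entry -> file contents.
// A real script would read these from disk; that part is left out here.
function buildManifest(files) {
  const hash = crypto.createHash('sha1');
  const paths = Object.keys(files).sort(); // stable order gives a stable hash
  for (const p of paths) {
    hash.update(p);
    hash.update(files[p]);
  }
  const version = hash.digest('hex');
  return [
    'CACHE MANIFEST',
    '# version ' + version, // changes whenever any listed file changes
    'CACHE:',
    ...paths,
    'NETWORK:',
    '*',
    ''
  ].join('\n');
}

const before = buildManifest({ 'index.html': '<p>v1</p>', 'app.js': 'init();' });
const after = buildManifest({ 'index.html': '<p>v2</p>', 'app.js': 'init();' });
console.log(before === after); // false: the content change altered the version comment
```

Because the browser compares the manifest byte for byte, a changed version comment is enough to trigger a re-download of the listed files.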

Related

How to stop using cache manifest on a live site

I changed my cache.manifest file to the following:
CACHE MANIFEST
NETWORK:
*
CACHE:
FALLBACK:
This triggered an update to my site.
I tried adding a "#" to the manifest file and then removed the manifest="cache.manifest" attribute from my page.
This triggered the cache to be updated again, even though the reference to the manifest was gone. The console indicated it was still being loaded from cache.
I even tried renaming my cache.manifest file and it still was being loaded from cache.
How in the world can I safely stop using cache manifest? I have a completely new version of my site I want to roll out, but if cache manifest is still trying to cache the new site, that will be a disaster for all my visitors who do not know they need to clear their browser cache.
I believe I found the trick.
It seems I need to keep the old cache.manifest but change (not remove) the reference to a non-existent manifest file.
Presumably, I'll need to keep the broken reference there for several years in case I have visitors who only visit periodically.
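A different approach from the one I settled on, offered only as an untested sketch: the application cache update process in the HTML specification treats a 404 or 410 response for the manifest URL as a signal to mark the cache obsolete. The snippet below assumes a Node-style server where you choose the status code yourself; retiredManifests and statusFor are hypothetical names.

```javascript
// Hypothetical list of manifest URLs that have been retired.
const retiredManifests = new Set(['/cache.manifest']);

// Returns the status code a server could send for a given request path.
function statusFor(path) {
  // 410 Gone (or 404) on the manifest URL tells the browser the cache
  // group is obsolete; other paths are served normally.
  return retiredManifests.has(path) ? 410 : 200;
}

console.log(statusFor('/cache.manifest')); // 410
console.log(statusFor('/index.html'));     // 200
```

Note that the already-cached page is usually still shown once more, because the manifest check happens in the background while the cached copy is being displayed.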

HTML5 Cache -- Is it possible to have several distinct caches for a single URL?

Every URL can be linked to a single cache manifest. But I want several cache manifests linked to a same URL. Here is the reason:
Some files I want to be cached are rarely updated and large.
So every time the cache gets updated, these large files get re-downloaded even though they may not have changed.
So I want to split up the cache: one cache for these rarely updated large files and another cache for the frequently updated light files.
Do you guys have any idea how to split up an HTML5 cache?
The most efficient way is:
a) Use far-future expiration date (max-age) on all resources mentioned in manifest's CACHE section and add timestamp suffix to each file in the CACHE section, e.g.:
CACHE:
menu_1355817388000.js
toolbar_1355817389100.js
b) When any of the above files change on the server, regen/update manifest to change the timestamp. Only the file with the modified timestamp will get downloaded next time. Mission accomplished.
Note: Reload the page twice in the browser, because on the first refresh the browser downloads just the manifest and uses the old cached resources to paint the page. This is done to speed up displaying the page (there are tricks to handle this double-refresh issue, but they are outside the scope of your question).
See more info in this long article, the best I have ever seen on appcache.
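Steps (a) and (b) above could be automated along these lines. This is only a sketch: timestampedName and cacheSection are hypothetical helpers, the modification times are passed in rather than read from disk, and the server is assumed to map the suffixed names back to the real files.

```javascript
// Hypothetical helper: inserts the last-modified time before the extension.
function timestampedName(name, mtimeMs) {
  const dot = name.lastIndexOf('.');
  if (dot === -1) return name + '_' + mtimeMs;
  return name.slice(0, dot) + '_' + mtimeMs + name.slice(dot);
}

// Builds the CACHE section from a list of { name, mtimeMs } records.
function cacheSection(files) {
  return ['CACHE:']
    .concat(files.map(f => timestampedName(f.name, f.mtimeMs)))
    .join('\n');
}

console.log(cacheSection([
  { name: 'menu.js', mtimeMs: 1355817388000 },
  { name: 'toolbar.js', mtimeMs: 1355817389100 }
]));
// CACHE:
// menu_1355817388000.js
// toolbar_1355817389100.js
```

When only toolbar.js changes, only its line in the manifest changes, so the far-future cached copy of menu.js is still valid under its old URL.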
Use an iframe
Your page's cache manifest would include the light files and the cache manifest of an iframe loaded by this page would include the large files
On Chrome, the iframe's application cache will also be used for the page. I haven't tested this method on other browsers yet.
see a live example at http://www.timer-tab.com and if you are using chrome see its split up cache at chrome://appcache-internals/
When the manifest file is changed and the files of the application cache are downloaded again, the normal HTTP caching rules still apply. This means that if you set the correct HTTP caching headers for these large files, you'll get a 304 so these files are not downloaded again. So it's not necessary to split the application cache.
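To illustrate the 304 behaviour described above, here is a simplified sketch of how a server might answer a conditional request. It assumes ETag-based validation with a single exact-match tag; the function name respondTo, the example tag value, and the placeholder body are hypothetical, and real servers also handle tag lists, weak validators, and If-Modified-Since.

```javascript
// Decide between 304 and 200 for a conditional GET.
// `requestHeaders` uses lower-case keys, as Node's http module provides them.
function respondTo(requestHeaders, currentEtag) {
  if (requestHeaders['if-none-match'] === currentEtag) {
    // Unchanged: no body is sent, so the large file is not downloaded again.
    return { status: 304, headers: { ETag: currentEtag }, body: null };
  }
  return { status: 200, headers: { ETag: currentEtag }, body: '<file bytes>' };
}

const etag = '"video-v1"'; // hypothetical validator for a large media file
console.log(respondTo({ 'if-none-match': '"video-v1"' }, etag).status); // 304
console.log(respondTo({}, etag).status);                                // 200
```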
Maybe not an answer, but I'd like to shed some light on my findings as I troubleshoot my own webapp.
I've discovered that I can use two iframes (manifest_framework and manifest_media) to load the manifests. I'm still not exactly clear on how they are targeted, but I had limited success.
manifest_framework:
CACHE MANIFEST
CACHE:
appdata.ini
dialog.png
jquery.min.js
login.htm
login.js
manifest.appcache.js
NETWORK:
*
FALLBACK:
manifest_media:
CACHE MANIFEST
CACHE:
manifest_fwk.php
od/audio_track_1_1.m4a
od/audio_track_1_2.m4a
od/audio_track_1_3.m4a
od/audio_track_1_4.m4a
od/video_1.mp4
od/video_2.mp4
od/video_3.mp4
NETWORK:
*
FALLBACK:
./ webapp.php
./index.php is the 'landing page', which itself isn't cached but falls back to webapp.php when offline.
What I don't understand is how these link to the webapp.php page.
I am finding I can only get access to one or the other manifests cache.
The above works in Mobile Safari: the media and the image would be cached, but not necessarily the JS or images in the framework manifest.
Anyone have more examples where multiple manifests are referenced from the one URL/page?
The W3C working group has abandoned the file system api, so it SHOULD NOT BE USED anymore.
We'll likely see it fall off the next version of Chrome.
http://www.w3.org/TR/file-system-api/
CACHE MANIFEST
# This is a comment.
# Cache manifest version 0.0.1
# If you change the version number in this comment,
# the cache manifest is no longer byte-for-byte
# identical.
demoimages/mypic.jpg
demoimages/yourpic.jpg
demoimages/ourpic.jpg
sr/scroll.js
NETWORK:
# All URLs that start with the following lines
# are whitelisted.
# whitelisted items are needed to help the site function, you could put regularly
# changing items here
http://example.com/examplepath/
http://www.example.org/otherexamplepath/
CACHE:
# Additional items to cache.
demoimages/allpics.jpg
FALLBACK:
demoimages/currentImg.jpg images/stockImage.jpg
If the Iframe trick does not work, use the HTML5 FileSystem API
See http://updates.html5rocks.com/2012/04/Taking-an-Entire-Page-Offline-using-the-HTML5-FileSystem-API

HTML5 manifest file

What is the purpose of the NETWORK section in the HTML5 manifest file? If I add a file in that section, doesn't it mean that the browser should not cache it, and it should be available only online?
I've added the file in Network section, but once I visit it online, it is always available offline. I have checked with FF5 and Chrome.
Here is my full manifest code, please see what is wrong with it?
Thanks.
CACHE MANIFEST
# cache files
CACHE:
index.html
offline.html
images/logo.jpg
# offline.html for all uncached pages
FALLBACK:
/ offline.html
# this should be available online only
NETWORK:
network.html
Obviously, it's a bug: http://code.google.com/p/chromium/issues/detail?id=91524
The manifest file allows for offline web applications where it 'caches' all the files listed in the manifest file and keeps them up to date for offline usage.
The NETWORK section is, in theory, the section for excluding * (everything) or a single file, as you're trying to do with network.html. However, application caching with the manifest file doesn't rule out the 'old-fashioned' caching mechanisms browsers have.
You've probably set some static content to be cache-able by the browser so depending on what server IIS/Apache you need to adjust your Expire / Cache-control settings.
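To see which section each entry of a manifest ends up in, a small parser can help. The following is a simplified sketch rather than a faithful implementation of the browser's parser: parseManifest is a hypothetical name, and details such as the SETTINGS: section and byte-order marks are ignored.

```javascript
function parseManifest(text) {
  const sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
  const lines = text.split(/\r?\n/).map(l => l.trim());
  if (lines[0] !== 'CACHE MANIFEST') {
    throw new Error('missing CACHE MANIFEST header');
  }
  let current = 'CACHE'; // entries before any section header are cached explicitly
  for (const line of lines.slice(1)) {
    if (line === '' || line.startsWith('#')) continue; // blank lines and comments
    const header = /^(CACHE|NETWORK|FALLBACK):$/.exec(line);
    if (header) { current = header[1]; continue; }
    sections[current].push(line);
  }
  return sections;
}

const sections = parseManifest([
  'CACHE MANIFEST',
  '# cache files',
  'CACHE:',
  'index.html',
  'offline.html',
  'images/logo.jpg',
  'FALLBACK:',
  '/ offline.html',
  'NETWORK:',
  'network.html'
].join('\n'));
console.log(sections.NETWORK); // [ 'network.html' ]
```

Run against the manifest from the question, network.html lands only in the NETWORK list, which suggests the manifest itself is written as intended and the cached copy comes from somewhere else, such as the regular HTTP cache.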
Adding the file in the NETWORK Section, still saves file in cache and shows from the cache when I am online, whereas my expectation is it should ALWAYS fetch from online.
When I add <meta http-equiv="Pragma" content="no-cache">, it always fetches that file from the server.
The pragma is a meta type that is primarily used for IE. You might try setting Cache-Control to no-cache, adding the pragma for IE, and setting the metas for expire, public, store, etc. to control the page. At this point, creating the manifest file does cause browser caching to be enabled. You must add the MIME type text/cache-manifest and save the file with the extension .appcache.
Example:
CACHE MANIFEST
# the above is a required line
# this is a comment
# spaces are ignored
# blank lines are ignored
CACHE:
/favicon.ico
index.cfm
# offline.html for all uncached pages
FALLBACK:
/ offline.html
# this should be available online only
NETWORK:
network.html
Best regards,
Link Worx Seo
I had similar problems.
Try configuring your server to expire content immediately in the HTTP Response Headers section like in the link above if the site is hosted on IIS.
If it is hosted on Apache you'll probably want to look at this.

HTML5 cache manifest and prefetching

One thing I'm not fully grasping is if the cache manifest is also acting as a prefetch when it is online for all the files listed.
For example, let's say I'm visiting:
/page1.html
Each of the pages on my site will have the same declaration:
<html manifest="/cache.manifest">
In the cache manifest file, I have:
CACHE MANIFEST
/page2.html
/page3.html
/page4.html
So what will happen is I visit /page1.html first, and when I'm online my browser will know to cache pages 2-4 also. And when I'm disconnected and I visit pages 2-4 everything will load just fine because it was already cached.
QUESTION: If I visit /page1.html, and I'm STILL connected online, and visit /page2.html, will my browser still request /page2.html, or will it not make another request to the server and use what it cached from the /cache.manifest file? Essentially acting like the prefetch link that Firefox uses?
Well, the spec says "all files," without any exceptions for html files, so I figure it works for html files just like any other, it gets taken from the cache, not the server. However, I have not done any testing to confirm this. I would do the following:
Create the following cache manifest file:
CACHE MANIFEST
/page1.html
/page2.html
/page3.html
/page4.html
Reference it from each of the four HTML pages. Then:
Visit page1.html
Edit page2.html to make it different than before you visited page1.html
Visit page2.html
See which version you get.
Make sure you try it out on all browsers. I'll be interested to see your results.
When we use cache manifest it takes the files from the cache each time you load the page.
There is a solution for this.
You have to change the version number in the manifest file whenever you make any changes to the HTML files, so that the browser pulls in the latest version of the HTML from the server and stores it in the cache.
CACHE MANIFEST
#v01
/page1.html
/page2.html
/page3.html
/page4.html
You can just increment the v01 to v02, v03, and so on; this ensures your cache will have the latest version of the HTML pages.
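If you would rather not edit the version comment by hand, the bump can be scripted. This is a small sketch under the assumption that the manifest contains a comment line of the form #vNN, as in the example above; bumpVersion is a hypothetical helper name.

```javascript
// Increments the first "#vNN" comment line, keeping the zero padding.
function bumpVersion(manifestText) {
  return manifestText.replace(/^#v(\d+)$/m, (match, digits) => {
    const next = String(Number(digits) + 1).padStart(digits.length, '0');
    return '#v' + next;
  });
}

const manifest = ['CACHE MANIFEST', '#v01', '/page1.html', '/page2.html'].join('\n');
console.log(bumpVersion(manifest).split('\n')[1]); // #v02
```

Any change to the comment makes the manifest differ byte for byte from the cached copy, which is what prompts the browser to re-fetch the listed pages.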
I think it takes it from the manifest file even if you are online :). Can't you try it out by uploading a file and then navigating to the page?

How can I get rid of the HTML5 offline cache?

I have an application which used to use the HTML5 offline cache. Now I've decided to not use it anymore and removed the manifest attribute from the index.html file. However, browsers still regard this site as cached and refuse to update the index.html file.
Even updating the manifest doesn't help. How can I remove the site from the user's offline caches? Am I stuck with a cached web site forever?
You need to make sure the manifest file isn't being cached, which by default it will be.
Adding
ExpiresActive On
ExpiresDefault "access"
to your .htaccess will stop everything from being cached, though you really only want the manifest file handled this way, like this (remember to update the filename):
<Files cache.manifest>
ExpiresActive On
ExpiresDefault "access"
</Files>
You really need to do that first, but this should alleviate the problem.
I'd recommend reading through Mark Pilgrim's page on this as well.
Try changing contents of your manifest to simply CACHE MANIFEST with no files listed. The clients should retrieve the new manifest next time they hit the site and their cache should be removed.
Note however that they won't be using this new, empty manifest until they refresh the page.
I've found that in some cases on some browsers they don't necessarily grab the new manifest right away. This behavior seems inconsistent though. When this happens I tend to clear their caches / offline storage manually in order to force them to update (though I understand you can't necessarily get users to do this).