HTML5 webapp design: app cache - html

I'm developing a mobile web app, and I'd like to take advantage of the new HMTL5 caching features. The app consists in a photo manager: the user can create albums, store photos, edit pictures and data, and so on. I use the jQuery Mobile framework and all data is stored client-side (webstorage) apart from images, which are uploaded to the server.
I have not added the HTML5 caching yet, but I rely on the normal browser caching for images, and when the user edits an image and this is uploaded to the server, I change the querystring attached to the image request so I get the updated version (a trick I came to know here on stackoverflow).
I'd like to use HTML5 caching for everything, except for images, since this trick works like a charm, but I understand that once I add HMTL5 caching, every resource is:
either cached and not updated until a new manifest is detected / I do it programmatically (and I can't choose which resource to update) (CACHE section)
or not cached at all and reloaded everytime (NETWORK section)
Is there a way to have the cake and eat it too? :-)
Thank you very much.

Not every resource is cached once you start caching, depends on what is specified in your manifest file, so you could try to take out from the manifest the image urls you do not want to get cached.

Related

PDF in Object tag does not read latest version

I am hosting a website where a couple of PDF-files which are currently in Object-tags are updated weekly.
The name of these PDF-files stay the same, but the data changes.
Currently I'm using:
<object id="men" data="seasons/S2223/Men2023.pdf?" type="application/pdf" width="100%" height="750px">
<p>The file could not be read in the browser
Click here to download
</p>
</object>
When I update the PDF I'm expecting the
data="seasons/S2223/Men2023.pdf?"
to be reading the latest PDF however it stays the same as before.
I added the ? at the end of the filename which should check for the latest version but it doesn't seem to work.
When I clear my browser's cache it's updated but ofcourse this isn't a suitable option for users.
All help is appreciated.
Caching, in this context, is where the browser has loaded the data from a URL in the past and still has a local copy of it. To speed things up and save bandwidth, it uses its local copy instead of asking the server for a fresh copy.
If you want the browser to fetch a fresh copy then you need to do something to make it think that the copy it has in the cache is no good.
Cache Busting Query String
You are trying to use this approach, but it isn't really suitable for your needs and your implementation is broken.
This technique is designed for resources that change infrequently and unpredictable such as the stylesheet for a website. (Since your resources change weekly, this isn't a good option for you.)
It works by changing the URL to the resource whenever the resource changes. This means that the URL doesn't match the one the browser has cached data for. Since the browser doesn't know about the new URL it has to ask for it fresh.
Since you have hardcoded the query to n=1, it never changes which defeats the object.
Common approaches are to set the value of the query to a time stamp or a checksum of the file. (This is usually done with the website's build tool as part of the deployment process.)
Cache Control Headers
HTTP provides mechanisms to tell the browser when it should get a new copy. There are a variety of headers and I encourage you to read this Caching Tutorial for Web Authors and Webmasters as it covers the topic well.
Since your documents expire weekly, I think the best approach for you would be to set an Expires header on the HTTP resource for the PDF's URL.
You could programmatically set it to (for example) one hour after the time a new version is expected to be uploaded.
How you go about this would depend on the HTTP server and/or server-side programming capabilities of the host where you deploy the PDF.

Html Application Cache - Check if empty

We are using a "manifest.appcache" file to control the application cache on our site. A part of the application should be accessible offline, which means that some of the pages have the reference on the manifest in the html-tag, others don't.
Is there any way to check if the cache is empty (from all pages)?
Example
Page A is available online only, so no manifest is referenced. Page B is available online and offline, so the manifest is referenced. Now we want to check on page A (online only) if page B is already cached (the cache is not empty).
There is no way to check if the cache is empty or if there is an page in the cache (with application cache).
If you really need to do it, there are two solutions you can use:
You can use cookies to track the sites that are loaded in the cache. This opens new problems: What if the user clears the cookies and not the application cache?
The more cleaner solution is to use IndexedDB or the HTML5 FileSystem to save the content and just cache wrappers for that content. You can cache the wrappers to the beginning and then you can handle the content with the APIs i mentioned (i used this one). This way you can simply check whether a page is in the cache or not.
Sorry for answering my own question, but i hope this saves some time.

How to cache pictures in html5 AppCache

I'm building a photography site, which ideally should work offline. Caching the css/js files required is pretty straightforward. The issue is, what to do with the actual photographs.
I'm currently loading them from flickr, both a thumbnail version and a full res one. This brings me to two questions:
Is it possible to cache files that are delivered from an external source, or does everything have to come from the same domain?
It is probably too much to cache all pictures in the app-cache, because it will cause a big download the first time a user hits the site. What are the recommendations here? Is it possible to have the user explicitly turn on the full version of the app cache?
If you're not serving anything over https, you can include resources from as many origins as you want in the CACHE section. But cross-origin appcaching with https only works in Chrome today. I doubt Flickr serves images over https, so you should be fine.
Some browsers will prompt the user the first time an appcache would be downloaded, but not all (I know Firefox does, but Chrome doesn't). For more control you would have to implement some logic in your application. Maybe let users make a choice within your application, store it as a per-user setting and then only serve your page with a manifest to users who opted in.

What is the difference between HTML5 AppCache and the normal browser cache?

I don't understand the point of the HTML5 AppCache. We already have a normal cache. If you visit a website the first time it'll already cache all the assets. What extra value does the AppCache provide? Is it just a list of files so that the browser knows what assets to download, even if they're not referenced by the HTML right now? Does the browser make sure that the caching is "all-or-nothing", i.e. does it ensure that everything referenced by the manifest is cached, or nothing at all?
I think the point you're missing is that AppCache is specifically designed to allow web apps (and web sites) to be made available offline, though the same speed benefits which the normal browser cache provides, when the user is online, are also provided by AppCache.
The key difference with the browser cache is that you can specify all the assets the browser should cache in a manifest file (conceivably your entire site) whereas the browser cache will only store the pages (and associated assets) you have actually visited.
I'm no expert on the AppCache, but I do know it is not without its problems. There's a really good article here from a chap who used AppCache to allow parts of his mobile site to be available offline. It includes some rationale on their decision to use it and a number of gotchas they encountered in doing so.
This HTML5 Rocks article on the subject also has some good information.
AppCache actually uses the browser cache in support of its operation. It is the browser equivalent of downloading an app to run locally.
The first time a user visits the page, the resources of that page will be loaded from the server and stored in the normal cache. If the page specifies an appcache manifest, the browser will download the manifest and fetch all the resources in there (even if they do not appear on the page that embedded the manifest). These are then stored in appcache.
The second time a user visits the page, the browser will check its appcache. If an entry exists for that URL, it will load the page from appcache instead of from the server, based on the rules specified in the manifest (the manifest can mark some resources explicitly as fetched from the network).
After the browser loads the page from appcache, it will contact the server to see if there is an updated manifest. If the manifest is updated, it will fetch the resources from the manifest. These fetches are done using normal browser cache rules, so some of these resources may actually end up being fetched from the regular browser cache instead of from the server (this allows you to do differential updates when using appcache to develop offline apps). The new version of the appcache is kept separate from the old version. After the new version is fetched the user keeps interacting with the resources from the old version until they refresh the main page, after which the new version is loaded and the old is discarded.
Another important point is that appcache has different rules for when resources are discarded. Appcache basically never discards the latest set of resources, and caches them as a whole. To prevent abuse it enforces storage limits (sometimes as little as 5 MB) of how big a site's cache can be. By contrast, the browser cache has no per-site limits, but will discard individual resources from a site if the global cache limits are reached.
The important feature of HTML 5 application cache is it makes available the web application offline. Which is not given by normal browser cache.
In addition to this application cache will provide
Speed - since the entire contents of the specified page will be cached to browser so it will provide a better speed than browser cache
Reduce Server Load - There is no need of a post back all the time, since all the contents are there in cache, till there is any changes in the Manifest file
Cache manifest :- The cache manifest file is the heart of HTML5 application cache. We can specify what are the pages need not be cached, what should not, and even we can reuse this one as a error handling technique, for that we can specify a custom error page in the FALLBACK section to show if the user request a content that requires network connectivity
For a basic understanding on Application cache you can See this tutorial

Basic HTML5 caching

I am a little slow to the HTML5 caching, I have just some simple questions though.
1) How long is data in a caching manifest cached?
2) If I update the data, how can I make sure the client checks for a newer version when it is available, or is this already done?
3) Also, is this completely useless for a non-0mobile environment or can it speed up load times on a desktop?
<html lang="en" manifest="offline.manifest">
offline.manifest
CACHE MANIFEST
index.html
style.css
image.jpg
image-med.jpg
image-small.jpg
notre-dame.jpg
1) As long as the user cares to cache it. The only way to completely get rid of the cache is to go into the browser settings and explicitly remove it.
2) If you update the manifest file, the client will download new versions of all the files. This download is still governed by 'old' HTTP caching rules, so set headers appropriately, also make sure you send a 'no-cache' header on the manifest file itself. The rules from HTML5 Boilerplate are probably a good place to start.
3) Remember desktops can lose connectivity too. Also, having files in application cache means they are always served locally so, providing you're sensible about what you put in it, the application cache can reduce bandwidth and latency. What I mean by sensible is: if most visitors only see a couple of pages of your site and you update the manifest of your entire site every week, then they could end up using more bandwidth if you're forcing them to cache a load of static files for pages they never look at.
To really cut down on bandwidth and latency in your HTML5 website of the future: use the application cache for all your assets and a static framework; use something like mustache to render all your content from JSON; send that JSON over Web Sockets instead of HTTP, saving you ~800 bytes and a two way network handshake per request; cache data with Local Storage to save you fetching it again, and manage navigation with the History API.
1) How long is data in a caching manifest cached?
Once an application is cached, it remains cached until one of the following happens:
The user clears the browser's cache
The manifest file is modified
The application cache is programatically updated
2) If I update the data, how can I make sure the client checks for a newer version when it is available, or is this already done?
you can specify witch files not to cache (NETWORK:)
If you want to update your cached files, you should modify something in the manifest file, the best way is to put comment in the file and change it when you want the browser to update the cache
3) Also, is this completely useless for a non-mobile environment or can it speed up load times on a desktop?
Yes it is useful, cause the internet can cut on all devices