Prevent Browser File Caching - html

I need to disable caching for single files in all browsers.
I have a website that generates small video clips. There is a preview stage where the results can be watched.
An mp4 called preview.mp4 is displayed. When the user goes back, edits the video and wants to preview it again, the old preview.mp4 is displayed even though the file on the server has changed.
How can I prevent the caching of this video file? Or what are the other ways to fix it?
Note: it's a single page application so I don't reload any HTML files. Only PHP content. Hence the headers I set, are not useful in this scenario:
<meta http-equiv="X-UA-Compatible" content="IE=Edge"/>
<meta http-equiv="cache-control" content="no-store" />
Thanks.

It's not the web page itself you should be concerned about, but the mp4 file which is downloaded and cached separately.
Ensure the response headers of the mp4 file prevent browser caching.
Cache-Control: no-cache
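As a rough illustration only, assuming the clip were served by a small Node.js process rather than directly by Apache or IIS, the header could be attached to the video response like this. The names PREVIEW_PATH, previewHeaders and handlePreview are made up for this sketch, and it ignores Range requests, which some browsers rely on for video playback and seeking.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Hypothetical location of the generated clip, resolved against the
// current working directory; adjust to your setup.
const PREVIEW_PATH = path.resolve("preview.mp4");

// Headers sent with the video response itself, not with the HTML page.
function previewHeaders(sizeBytes) {
  return {
    "Content-Type": "video/mp4",
    "Content-Length": String(sizeBytes),
    // "no-cache": the browser may keep a copy but must check with the
    // server before reusing it.
    "Cache-Control": "no-cache",
  };
}

function handlePreview(req, res) {
  fs.stat(PREVIEW_PATH, (err, stats) => {
    if (err) {
      res.writeHead(404, { "Content-Type": "text/plain" });
      res.end("preview not found");
      return;
    }
    res.writeHead(200, previewHeaders(stats.size));
    fs.createReadStream(PREVIEW_PATH).pipe(res);
  });
}

// The server is created but not started; call server.listen(port) to run it.
const server = http.createServer(handlePreview);
```

If the browser should never store the clip at all, "no-store" could be used in place of "no-cache".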

Hence the headers I set, are not useful in this scenario:
<meta http-equiv="cache-control" content="no-store" />
The problem here is that those are not headers.
They are HTML elements (which are part of the HTML document, which is the body of the HTTP response) which attempt to be equivalent to HTTP headers … and fail.
You need to set real HTTP headers, and you need to send them with the video file (rather than the HTML document).
The specifics of how you do that will depend on the HTTP server you use.
Further reading:
Caching in the HTTP specification
Caching Tutorial for Web Authors and Webmasters
Caching Guide in the Apache HTTPD manual
How to Modify the Cache-Control HTTP Header When You Use IIS

Try adding a cache key
preview.mp4?cachekey=randNum
Where randNum can be a timestamp or you use a random number generator to generate randNum.
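A minimal sketch of that idea in JavaScript, using a timestamp as the key. withCacheKey is a hypothetical helper, the parameter name cachekey is simply the one used above, and the function assumes the URL has no #fragment.

```javascript
// Appends a cache-busting query parameter to a URL string.
// Assumes the URL contains no "#fragment" part.
function withCacheKey(url, key = Date.now()) {
  // Use "&" if the URL already has a query string, "?" otherwise.
  const separator = url.includes("?") ? "&" : "?";
  return `${url}${separator}cachekey=${encodeURIComponent(key)}`;
}

console.log(withCacheKey("preview.mp4", 42));        // preview.mp4?cachekey=42
console.log(withCacheKey("preview.mp4?id=7", 42));   // preview.mp4?id=7&cachekey=42
```

In the page, the result would be assigned to the src of whatever video element shows the preview, so each preview request uses a URL the browser has not seen before.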

Related

Preventing image files being added to browser cache

My application presents image files to the user (for photographic competition judging). It may present several thousand quite large files during a single session.
To present each image, I obtain a URL via a webservice using AJAX and then cause it to be displayed with
$("#imgImage").prop('src', resp.URL);
I am concerned about the storage usage within the user's browser. Will each image be added to the cache and if so, how can I prevent it?
I have the meta directives
<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
<meta http-equiv="Expires" content="-1" />
but as the page itself is not reloaded each time, I'm not sure if they are effective.
You need to have the server send those headers with the HTTP response providing the images. In your page, they only apply to your page, not the images.
Beware that if your page ever shows the same image again (can the user go back? I'd want to be able to), the client will have to re-download it. If it's large, or they're on a metered connection, that may not be ideal.

how to prevent server I don't own from sending charset=UTF-8 in the http request

I have an old web site in French that I want to preserve and whose html files were encoded in iso-8859-1. All html files included
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
in the <head> element, however the host of my website changed something in the configuration and now pages are sent from their server with an HTTP header including
content-type: text/html; charset=UTF-8
and unfortunately someone decided this would override the <meta> information.
Do I have to trans-code all my html files to UTF-8 or is there a faster solution?
Update
In fact the charset was added to the http header's content-type field only for html content issued by php, not for pure html files. I'll put the solution I adopted as an answer.
Your options:
Transcode the files
Persuade whoever changed the server configuration to change it again
Change servers
Run every request through a server-side script which outputs a different Content-Type header and then outputs the HTML (while accounting for cache-control headers)
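To make the last option concrete, here is a sketch of such a pass-through script. It is written for Node.js purely as an illustration (on a PHP host the same idea would be expressed in PHP), and SITE_ROOT, legacyHtmlHeaders, resolveInsideRoot and handleLegacyPage are hypothetical names. The file bytes are sent untouched; only the declared charset changes.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Hypothetical folder holding the ISO-8859-1 pages, resolved against
// the current working directory.
const SITE_ROOT = path.resolve("site");

// Headers declaring the charset the files were actually saved in.
function legacyHtmlHeaders(sizeBytes) {
  return {
    "Content-Type": "text/html; charset=ISO-8859-1",
    "Content-Length": String(sizeBytes),
    // Caches may keep the page but must revalidate before reusing it.
    "Cache-Control": "no-cache",
  };
}

// Maps a URL path to a file inside SITE_ROOT, or null if it would escape it.
function resolveInsideRoot(urlPath) {
  const filePath = path.join(SITE_ROOT, urlPath);
  return filePath.startsWith(SITE_ROOT + path.sep) ? filePath : null;
}

function handleLegacyPage(req, res) {
  let filePath = null;
  try {
    filePath = resolveInsideRoot(new URL(req.url, "http://localhost").pathname);
  } catch (err) {
    filePath = null; // unparsable request target
  }
  if (filePath === null) {
    res.writeHead(400, { "Content-Type": "text/plain" });
    res.end("bad request");
    return;
  }
  // Read raw bytes (a Buffer), so nothing is transcoded on the way out.
  fs.readFile(filePath, (err, bytes) => {
    if (err) {
      res.writeHead(404, { "Content-Type": "text/plain" });
      res.end("not found");
      return;
    }
    res.writeHead(200, legacyHtmlHeaders(bytes.length));
    res.end(bytes);
  });
}

// Created but not started; call server.listen(port) to run it.
const server = http.createServer(handleLegacyPage);
```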
Took me a while to realize the problem occurs only for .php files. The fix I chose is the following: I added the line
ini_set('default_charset', NULL);
at the beginning of every PHP file. A bit tedious, but it seems reasonable to me.

.html Caching in HTML

I have a web application published on IIS. All of my JS files are loaded from a static HTML file called "Index.html". In that file, I load each JS file with a <script> tag, and to manage versions and ship updates without users having to clear their history and cache, I've added ?v={version} to the end of each JS file's URL, as follows:
<script src="./app.js?v=20161226.1"></script>
After several version updates, I noticed that users still need to refresh the page to get the latest Index.html file. In the Network section of Chrome's Developer Tools, I could see that Index.html is loaded from the cache (it is marked "(from cache)").
After searching the web for a way to stop .html files from being cached (there is no ?v={version} for my .html file), I found that adding:
<meta http-equiv="cache-control" content="no-cache" />
<meta http-equiv="expires" content="0" />
<meta http-equiv="pragma" content="no-cache" />
doesn't solve the issue; my Index.html file is still loaded from the cache.
I update my web application every two weeks, and I can't ask users to delete their cache and history after every update just because the latest .html file isn't loaded while the old one is cached.
The only thing that helps is refreshing (F5), after which Index.html is reloaded (not from the cache, and the latest version is shown). But if someone types the URL into the address bar, Index.html is still loaded from the cache.
Is there anything I've done wrong, or anything else I should add?
Is there anything to do to solve this issue at all?
Thanks!
Putting a query string on the end of a URL is a (good) hack to allow you to set the HTTP cache control headers to cache for a long time for infrequently changed resources and still force the new version to load on those occasions that you do change it.
If you are frequently updating your HTML, then just set the cache control headers to tell the browser to check for updates more frequently. Take advantage of ETags or If-Modified-Since instead of depending on an Expires header set far in the future.
NB: You have to use real HTTP headers. <meta http-equiv> is a bad joke.
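To illustrate the ETag approach, here is a minimal sketch of the server-side decision in JavaScript. makeEtag and conditionalResponse are hypothetical helpers, hashing the contents is only one of several ways to build a validator, and the comparison is simplified: a real If-None-Match header may carry several values or weak validators.

```javascript
const crypto = require("crypto");

// Derives a validator from the file contents: new contents, new ETag.
function makeEtag(body) {
  const hash = crypto.createHash("sha1").update(body).digest("hex");
  return `"${hash}"`;
}

// requestHeaders uses lower-case names, as Node's req.headers does.
function conditionalResponse(requestHeaders, body) {
  const etag = makeEtag(body);
  // "no-cache" makes the browser revalidate on every use.
  const headers = { ETag: etag, "Cache-Control": "no-cache" };
  if (requestHeaders["if-none-match"] === etag) {
    // The client's copy is still current: send headers only, no body.
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body };
}

const v1 = "<html>version 1</html>";
const v2 = "<html>version 2</html>";
const first = conditionalResponse({}, v1);
const cached = { "if-none-match": first.headers.ETag };
console.log(first.status);                            // 200
console.log(conditionalResponse(cached, v1).status);  // 304
console.log(conditionalResponse(cached, v2).status);  // 200
```

The browser still asks the server on every visit, but an unchanged Index.html costs only a small 304 response, while a newly deployed version is picked up immediately.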

Why use meta tag "Pragma" and "Expires" in head section of html

Why use meta tag "Pragma" and "Expires" in head section of html like this.
Thanks.
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Expires" CONTENT="-1">
These tags are meant to stop the browser from caching your webpage.
Disabling the cache has some advantages.
For example, when you update your files on the server, a browser that has no cached copy of your webpage is forced to load the updated content.
One disadvantage is the impact on page load time. Since the browser has no cached copy, it has to download the content from the server every time, which takes longer and consumes bandwidth.
Try reading this article.
Both tags are meant to prevent browsers from caching the HTML page, and they usually do that. This means that access to the page may be slower especially if it is frequently visited. Probably most commonly, these tags are inserted by people who do not understand how caches work. See Caching Tutorial for Web Authors and Webmasters.
There are several ways to try to prevent caching. These specific tags have no official definition, and they do not conform to HTML5 CR.

html5 meta tag cache-control no longer valid?

How do I define
<meta http-equiv="cache-control" content="no-cache" />
in HTML5? It is no longer valid according to the W3C Validator and the documentation.
Putting caching instructions into meta tags is not a good idea, because although browsers may read them, proxies won't. For that reason, they are invalid and you should send caching instructions as real HTTP headers.
In the beginning of code you need to use this:
<!DOCTYPE html>
<html manifest="cache.manifest">
...
Then create cache.manifest file with content of what you want to cache i.e
CACHE MANIFEST
# 2010-06-18:v2
# Explicitly cached 'master entries'.
CACHE:
/favicon.ico
index.html
stylesheet.css
images/logo.png
scripts/main.js
# Resources that require the user to be online.
NETWORK:
*
# static.html will be served if main.py is inaccessible
# offline.jpg will be served in place of all images in images/large/
# offline.html will be served in place of all other .html files
FALLBACK:
/main.py /static.html
images/large/ images/offline.jpg
A manifest can have three distinct sections: CACHE, NETWORK, and FALLBACK.
CACHE:
This is the default section for entries. Files listed under this header (or immediately after the CACHE MANIFEST) will be explicitly cached after they're downloaded for the first time.
NETWORK:
Files listed in this section may come from the network if they aren't in the cache, otherwise the network isn't used, even if the user is online. You can white-list specific URLs here, or simply "*", which allows all URLs. Most sites need "*".
FALLBACK:
An optional section specifying fallback pages if a resource is inaccessible. The first URI is the resource, the second is the fallback used if the network request fails or errors. Both URIs must be from the same origin as the manifest file. You can capture specific URLs but also URL prefixes. "images/large/" will capture failures from URLs such as "images/large/whatever/img.jpg".
There is no HTML solution. Mozilla's application cache (cache.manifest) is deprecated. The application cache site says:
This feature has been removed from the Web standards. Though some browsers may still support it, it is in the process of being dropped. Avoid using it and update existing code if possible. ... Use Service Workers instead.
Apart from that, I suggest you use HTTP Cache-Control to solve cache issues.
There isn't an HTML solution, because it's not a markup problem. Caching is an action on the resource, not part of the resource definition itself.
As others have said, HTTP headers are the best way to control caches, because these are observed by all caches - <meta> tags are only observed by browser caches. These should be set by your server / web framework.
That said, I wouldn't be surprised if browsers still observe <meta http-equiv="cache-control" content="no-cache"> for pages with the HTML5 doctype.