CloudFront Cache HTML - html

Can Amazon CloudFront be used to cache HTML pages, and not just images, CSS files, etc.?
If not, is there a comparable service out there that does this? That is, I overlay the service on a domain, and it only queries the origin site again when the cached page has expired.
I looked at CloudFlare as well and they don't yet do this.

Yes, you can serve HTML through CloudFront.
The main disadvantage shows up when you need to update the HTML: unlike static assets, you can't version HTML URLs, for SEO reasons.
So if you set a cache lifetime of 1 hour in CloudFront, for example, the page is kept in CloudFront for at most 1 hour, and only after that will CloudFront fetch the HTML from your origin again and update it.
You can use invalidations in CloudFront to speed up the process, but you need a full list of your HTML pages ready to copy and paste into AWS for invalidating.
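As a rough sketch of how that list of HTML pages could be produced instead of maintained by hand, the snippet below walks a local copy of the site and builds the path list plus the batch structure an invalidation request takes. The function names and the idea of a local `site_root` folder are assumptions for illustration, not part of CloudFront itself.

```python
import time
from pathlib import Path


def html_invalidation_paths(site_root):
    """Return URL paths such as "/about.html" for every .html file under site_root."""
    root = Path(site_root)
    return sorted("/" + p.relative_to(root).as_posix() for p in root.rglob("*.html"))


def invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch dictionary used by a CloudFront invalidation request."""
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # The reference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or str(time.time()),
    }
```

If boto3 is installed, the resulting dictionary can be passed as `InvalidationBatch` to `boto3.client("cloudfront").create_invalidation(DistributionId=..., InvalidationBatch=batch)`, with your own distribution ID filled in.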
Of course, all this works only for fixed web pages that do not change per user.
You can apply it even to ASP / PHP, but only if the generated content is the same for all users.
So you keep the PHP on your origin, and CloudFront stores the HTML it generates.
My English is not the best, so I hope I cleared something up...

Yes, you can serve HTML through CloudFront as long as you don't mind every user getting the same content until the cache expires.
I can't imagine a CDN that would not support this. They might not advertise it since many web sites are dynamic and can't be cached, but if your site is basically static, then any CDN should work.

Related

Embedding external content despite X-Frame-Options: SAMEORIGIN

I have searched the Internet, but I didn't manage to find a solution to this problem:
Is it possible to embed external content when the server sends X-Frame-Options: SAMEORIGIN?
I know that <iframe> can't be used, but maybe there is some other way.
Thanks in advance
No, it's not possible to show another page's contents within your website if it sets the HTTP header X-Frame-Options: SAMEORIGIN. That header says that the page can only be framed by pages from the same origin as the page itself.
However, if you are running your own server-side application (e.g. using PHP, Node.js, etc.), you can scrape the website on your server and then display whatever info you need from the other site that way. It will be more work, and you probably won't be able to perfectly replicate how everything appeared on the source site, but it's the only route you've got. I suggest googling "scraping" + the name of your server-side language/environment to learn how to do this.
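To give an idea of what server-side scraping can look like, here is a minimal sketch in Python using only the standard library. It fetches a page on the server and pulls out just the `<title>` text; the class and function names are made up for this example, and a real scraper would extract whatever parts you actually need (and should respect the other site's terms of use).

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title found in an HTML string (empty string if none)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def fetch_title(url):
    """Fetch the page from the server side and return its title."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return extract_title(response.read().decode(charset, errors="replace"))
```

Your own page would then render the extracted data itself, so no iframe (and no X-Frame-Options check) is involved.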

How to set an expiration date for an html link

I'm not sure if I'm being a complete noob at this (it's been a looooong night :D), but is it possible to cache links with .htaccess? I know that you can set extensions and stuff like jpg, png, css, js, etc.
And if you've ever hosted a website, I'm sure you've probably used one of those online "website optimizers", and I keep getting the message "The following cacheable resources have a short freshness lifetime. Specify an expiration at least one week in the future for the following resources:"
...followed by a list of outside links like Facebook and Google.
Any ideas?
You cannot alter the headers or the content of external resources, such as those served from Google's CDN or Facebook. Assume that big companies like Google and Facebook know how to cache, which resources are worth caching, and for how long.
For resources on your own server, you can set the Cache-Control header with a custom time to tell the client for how long the page can be cached.
# Let clients cache CSS and JS files for 10 minutes (600 seconds)
<FilesMatch "\.(css|js)$">
Header set Cache-Control "public, no-transform, max-age=600"
</FilesMatch>
You can check how long it takes to load certain resources on your page by going to your browser and opening the developer console. Under the network tab you can see all requests that are being made. Make sure to load the page both with cache and without cache.
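If you want to check your own headers against the optimizer's "at least one week" rule without using the tool, a small sketch like the one below can parse a Cache-Control value and compare its max-age. The helper names are hypothetical, and it only looks at max-age (not Expires or s-maxage).

```python
ONE_WEEK_SECONDS = 7 * 24 * 60 * 60


def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def has_long_freshness(cache_control, minimum=ONE_WEEK_SECONDS):
    """True if the header allows caching for at least `minimum` seconds."""
    age = max_age_seconds(cache_control)
    return age is not None and age >= minimum
```

With the header from the snippet above, `max-age=600` is well under a week, so the optimizer would still flag those files. Raising it to 604800 would satisfy the rule.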

Exclude the page itself from appcache

I have an appcache (with NETWORK *). So now I visit my page with <html manifest="/cache.appcache">. Then the page itself is cached, as are all the images. But I want the page itself not to be cached. How can I do this? I thought NETWORK * would do the trick.
Regards,
Kevin
The appcache manifest always caches the master page.
If you are using Chrome check the cached files for your page here: chrome://appcache-internals
A workaround could be to put a hidden iframe somewhere on your page, which contains the appcache file to cache offline content. (take a look at "Preventing the application cache from storing masters with an iframe" here: http://labs.ft.com/2012/11/using-an-iframe-to-stop-app-cache-storing-masters/ )
A better solution could be to write your page to fetch new content from your server when it is opened - if the server cannot be reached, it can serve the last known content from the HTML5 local storage.
I have tried the iframe workaround and found it rife with errors. Most browsers cache the data for the iframe where the page cannot get it.
Instead, make the page's content load via AJAX. Basically, have a blank HTML page with the manifest and JavaScript that pulls its content from the server and adds it to the page. This way only the blank HTML is cached, and the content is always updated from the server.
Converting a page to this method can be very difficult, but it works. Making sure the appropriate JavaScript runs at the correct time probably requires some untangling, as does moving server code that won't be called when the page is pulled from cache into the new AJAX method.
Note: there is no need to pull conditional content from the server if the condition is in the query string, because different query strings create separate cache entries.

Omit current page from HTML5 offline appcache but use cached resources

For performance purposes, I want to have some of my web pages use resources that have been cached for offline use (images, CSS, etc.) but to not have the page itself cached as the content will be generated dynamically.
One way to do this would be to refactor my pages so that they load the dynamic content via AJAX or by looking things up in LocalStorage. Details may vary, but broadly speaking, something like that.
If it's possible, I'd prefer to find a way to simply instruct the browser to use cached resources (again, images, CSS, etc.) for the page but to not actually cache the (dynamically generated) HTML content itself.
Is there a way to do that with HTML5 offline appcache? I'm under the impression that the answer is "no" because:
Any page that includes the manifest will be cached so I can't specify the cached resources in the page itself.
There is no way to tell a previous page "use offline assets for this other page but don't actually cache the HTML on that page". You have to specify the page itself, which means the HTML will be cached.
Am I wrong about that? It seems like there is probably some tricky (or not-so-tricky) way around that. Now that I've typed it out, I wonder if including the page explicitly in the NETWORK section of the appcache manifest will do the trick.
My answer is "yes".
I have worked on a web-app where I listed all the necessary resources in the manifest, and set the NETWORK section to *.
The manifest is then included only on the main landing page. So all resources are cached the first time you visit the site, and it works a treat.
In short,
one of your pages must include the manifest and will therefore be cached.
maybe you can have the manifest loaded in an iframe and not have the whole page cached, just a thought.
list all your resources to be cached in the CACHE section
set the NETWORK section to *
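The last two points can be sketched as a small generator for the manifest file. This is only an illustration: the function name, the resource paths, and the version comment are assumptions, not something taken from the web-app mentioned above.

```python
def build_manifest(cached_resources, network=("*",), version="1"):
    """Return appcache manifest text with a CACHE section and a NETWORK section."""
    lines = ["CACHE MANIFEST", f"# version {version}", "", "CACHE:"]
    lines += list(cached_resources)
    lines += ["", "NETWORK:"]
    lines += list(network)
    return "\n".join(lines) + "\n"
```

For example, `build_manifest(["/css/site.css", "/img/logo.png"])` lists the two assets under CACHE: and puts `*` under NETWORK:. Changing the version comment changes the manifest's bytes, which is what prompts browsers to re-download the cached resources.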
I'm fairly certain that the answer to this is no.
If you use the Network section in Chrome, then it shows which resources are loaded from the cache and which are loaded from the server. I have attempted to set the appcache as described above and the resources are always loaded from the server.
Would I be correct in assuming that if the current page is not in the appcache, then the browser won't bother to check the appcache for any of the resources?
What I've found that is working is to list those files that you don't want cached in appcache in the NETWORK: section of the manifest. For me, that meant adding *.asp* to the network section. Now, none of the classic asp files, or aspx files are cached.

How do I configure optimal cache policy for index.html page in my site?

I have a web site with an index.html homepage that is updated from time to time. We sometimes add offers for our clients, special messages and so on, which have to be visible by next day for everyone.
If index.html is cached by browsers, many users will not notice that anything has changed, unless they explicitly refresh the contents of the page...
Which is the best way to be sure that 100% visitors have an up-to-date index.html page, without compromising cache performance?
My best answer would be to stop editing index.html by hand each time and go with a server-side programming language, like PHP. You can then set the headers for the page so that it is not cached, and you can also set up an admin page that you can use to change the content. Or you could go with a browser-side script, using JavaScript and AJAX. That way the page can pick up new content without waiting for the next full load of the site.
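The header idea does not depend on PHP specifically. As a minimal sketch in Python (the policy values and names here are assumptions, not a recommendation tuned to your site), the server can send `no-cache` for HTML so browsers revalidate it on every visit, while other assets keep a long lifetime:

```python
from http.server import SimpleHTTPRequestHandler


def cache_control_for(path):
    """Pick a Cache-Control value: revalidate HTML every time, cache other assets for a week."""
    clean = path.split("?", 1)[0]
    if clean.endswith("/") or clean.endswith(".html"):
        return "no-cache"
    return "public, max-age=604800"


class CachePolicyHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the Cache-Control header to every response."""

    def end_headers(self):
        self.send_header("Cache-Control", cache_control_for(self.path))
        super().end_headers()
```

To try it locally, the handler can be served with `http.server.ThreadingHTTPServer(("127.0.0.1", 8000), CachePolicyHandler).serve_forever()`. Note that `no-cache` still lets the browser store the page; it just has to check with the server before reusing it, so unchanged pages can be answered cheaply.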