I am working on a Laravel application that is a social network. Images are stored in an S3 bucket, where pricing is based on the number of GET/PUT/DELETE (and other) requests. I want to reduce the number of requests sent to the S3 bucket as much as possible.
Scenario: imagine a Facebook post and its comments.
A user's profile picture is pulled from the S3 bucket on page load. In the comments section of a post, one user has commented 10 times. I write the markup as usual:
<img src="https://s3-ap-southeast-1.amazonaws.com/somebucket/32431435696950423.jpg">
Is a new request sent to the bucket for each comment, or is the image cached by default after the first request and served from the cache for the rest?
How do I avoid multiple GET requests for a single image?
It depends on the browser implementation and on your image's Cache-Control header. Most modern browsers support caching: they will cache your image if it is allowed to be cached, and will not if it is not. See the question "When multiple instances of same images are embedded in an HTML, does that load the image once?".
AWS S3 can be configured so that your objects are cacheable (read up on how to add Cache-Control metadata in AWS S3).
But if your site has high traffic, I suggest using AWS CloudFront in front of S3 rather than S3 alone. It is a CDN (Content Delivery Network). It is faster and can be cheaper than plain S3.
"or by default the image is cached after the first request and pulled from the cache for the rest?" That is correct, but only if every image tag uses exactly the same source URL, including the file name.
So 10 copies of the same image from one URL will be downloaded once.
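To make the Cache-Control suggestion concrete, here is a minimal sketch of uploading an object with caching metadata. It assumes boto3 is used for the upload and that `s3_client` is a `boto3.client("s3")`; the helper names and the one-year max-age are my own choices for illustration, not something from the question.

```python
from typing import Any, Dict

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed lifetime; pick what suits your images


def cache_extra_args(content_type: str, max_age: int = ONE_YEAR_SECONDS) -> Dict[str, str]:
    """Build the ExtraArgs dict that tells S3 which headers to store with the object."""
    return {
        "ContentType": content_type,
        # S3 sends this value back as the Cache-Control response header on GET.
        "CacheControl": f"public, max-age={max_age}",
    }


def upload_cacheable(s3_client: Any, path: str, bucket: str, key: str, content_type: str) -> None:
    """Upload a local file with caching metadata.

    `s3_client` is expected to be boto3.client("s3"); it is passed in so the
    helper itself has no hard dependency on boto3.
    """
    s3_client.upload_file(path, bucket, key, ExtraArgs=cache_extra_args(content_type))
```

With a long max-age the browser should reuse its cached copy for repeated `<img>` tags and for later page loads, so a changed picture would need a new file name or URL.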
Related
I have a small django website that allows file uploading hosted on pythonanywhere. The files are uploaded to aws s3.
Now the problem is that even with the download attribute present in the HTML, the browser still renders the file instead of downloading it:
<a download href="{{file_url}}">Download</a>
From Mozilla Developer Network Anchor element reference:
"....download only works for same-origin URLs, or the blob: and data: schemes..."
Your AWS S3 links are most probably not same-origin (same domain, etc.) with your site.
If that is what you are running into, one workaround that comes to mind is to add a transfer URL on your site that receives the document identifier, downloads the file from AWS S3, and then forwards its content as the response. This way you can also control headers such as Content-Type, which you may need to set explicitly to make the browser behave the way you want.
One addition: if you go with a solution like that, you have to validate the requests that reach the transfer URL and only transfer content that your website intends to serve. Otherwise you will have opened a vulnerability similar to what is called an "Open Redirect" vulnerability.
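A rough sketch of the validation part, framework-agnostic so it could sit inside a Django view. The `ALLOWED_DOCUMENTS` mapping and the `resolve_download` name are hypothetical; in a real site the lookup would probably come from your database. The point is that the client only ever sends an identifier, never a URL or bucket key.

```python
from typing import Dict, Mapping, Tuple
from urllib.parse import quote

# Hypothetical mapping from public document ids to S3 object keys.
ALLOWED_DOCUMENTS: Mapping[str, str] = {
    "42": "uploads/report.pdf",
}


def resolve_download(document_id: str) -> Tuple[str, Dict[str, str]]:
    """Return (s3_key, response_headers) for a known id; raise KeyError otherwise."""
    s3_key = ALLOWED_DOCUMENTS[document_id]  # unknown ids are rejected here
    filename = s3_key.rsplit("/", 1)[-1]
    headers = {
        # A generic type plus "attachment" asks the browser to save, not render.
        "Content-Type": "application/octet-stream",
        "Content-Disposition": f"attachment; filename*=UTF-8''{quote(filename)}",
    }
    return s3_key, headers
```

The view would then fetch `s3_key` from the bucket and stream the body back with those headers, returning a 404 when `KeyError` is raised.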
I have a server app that can serve files from its local file system. However, before it does that, it should check whether the file could instead be served from CloudFront and, if so, redirect there.
The files on the server are not necessarily all duplicated in the S3 bucket origin(s) associated with the CloudFront distribution, so there will be some cases where a redirect to CloudFront is inappropriate.
How can I query the CloudFront SDK to find out whether a redirect there would succeed (and not return a 404, for example)?
I appreciate that I could query the contents of an associated S3 bucket origin instead, but ideally I'd like to get the result from CloudFront so that it can do all of its caching and failover between multiple origins and origin groups. I don't really want to have to replicate all of that logic in my code!
I happen to be using the C# SDK, but I'm happy to accept answers in any language; it's more the principles behind it that I'm interested in. Am I perhaps thinking about this in the wrong way?
I was vastly overthinking this. Just making an HTTP request with HEAD rather than GET and checking that the response was 200 was all that was needed.
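For anyone who wants the shape of it, here is a minimal sketch in Python using only the standard library (my real code is C#). The `is_available` name and the 5 second timeout are assumptions for the example.

```python
import urllib.error
import urllib.request


def is_available(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to `url` is answered with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # 4xx/5xx answers (for example 403 or 404) are raised as HTTPError.
        return False
    except OSError:
        # URLError, timeouts and other socket errors are all OSError subclasses.
        return False
```

The server would call this with the CloudFront URL and only issue the redirect when it returns True, falling back to the local file otherwise.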
I'm using AWS S3 and Cloudfront to deliver images to my sites/domains.
I've been looking at AWS S3 CORS and wanted to ask: if I limit the domains, will this restrict other domains from accessing my images?
If I were to set the following on a bucket that contains images, would this stop other domains from accessing the images within the bucket, or do images operate differently to other resources under CORS?
<AllowedOrigin>http://www.example.com</AllowedOrigin>
Essentially I would like to restrict my images to my sites only.
Also, I heard you must include CloudFront as another AllowedOrigin for this to work. Can someone confirm this?
thanks
CORS is a policy enforced by the browser. It's not going to prevent users from downloading images from your CloudFront distribution.
You have two options.
Make all your files private and provide access via signed URLs. CloudFront won't really cache images in this case, however.
The other option is to configure CloudFront to forward all headers and use a bucket policy to limit access based on the Referer header. This can be circumvented, but it would prevent most casual hotlinking.
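As a sketch of the second option, this builds a bucket policy that only allows `s3:GetObject` when the Referer header matches one of your sites. The bucket name, the `Sid` and the `referer_policy` helper are placeholders I made up; `aws:Referer` is the condition key S3 evaluates.

```python
import json
from typing import Any, Dict, List


def referer_policy(bucket: str, allowed_referers: List[str]) -> Dict[str, Any]:
    """Build a bucket policy that allows GetObject only for matching Referer headers."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetFromOwnSites",  # placeholder statement id
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Wildcards let every page under the listed sites match.
                "Condition": {"StringLike": {"aws:Referer": allowed_referers}},
            }
        ],
    }


if __name__ == "__main__":
    policy = referer_policy("somebucket", ["http://www.example.com/*"])
    print(json.dumps(policy, indent=2))
```

This only has an effect if the objects are not public through some other grant, and CloudFront has to pass the Referer header through to S3. Keep in mind that forwarded headers become part of how CloudFront caches, so the hit ratio will likely drop.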
I've developed an iPad web app that uses the appcache. It's not intended to be a fully offline app but I use the appcache to store large image files so that they're not sent over 3G. Problem is when the manifest is updated the appcache updates whether the iPad is on wifi or 3G, which could be expensive.
Is it possible to have the user decide if the appcache can be updated or not? From what I've seen, this isn't possible, it all happens automatically, you just get events. But perhaps there's some trickery like writing the manifest on the fly or similar.
Using PHP on the server side if that helps. Thanks.
Connection Type: Theory & Future
There is a draft spec of the Network Information API at the W3C that exposes the connection type (ethernet, wifi, 2g, 3g, 4g, etc.), but it hasn't been implemented in any browser yet apart from:
the stock Android browser on Android 2.2+ (not the Google Chrome browser)
navigator.connection.type // Based on W3C draft, (Implemented on stock Android browser)
and PhoneGap which is not exactly a browser
navigator.network.connection.type // on PhoneGap
With that information available in the future, you could detect whether the user is on cellular data, then temporarily remove the src of the images and ask the user through a confirmation dialog.
You will also probably have to cancel the app cache update using:
window.applicationCache.abort() (documentation)
Reality
Unfortunately, the Net Info API is not available (at least not widespread) at the moment, but certainly will help in the future.
Long shot
There is a database that includes network speed (DIAL = dial up, DSL = broadband/cable, COMP = company/T1), but I haven't used it and I doubt it will help.
Dynamic App Cache
While looking into this, I tried to generate the html tag along with the manifest declaration on the fly, in order to combine it with the Network Info API, but the AppCache manifest is loaded before JavaScript execution and is not affected afterwards.
So altering the manifest file on the fly through JavaScript is not possible, and a data URI is not an option.
Alternative solution
HTML5 application cache is an untamed beast at the moment and there are talks to improve it. Until it changes to support more complex configurations (bandwidth level flag would be awesome), you could change perspective on the solution, although App Cache may be the best you have at the moment.
Depending on how big your images are, you could rely on the normal browser cache. You could combine localStorage and far-future expiration HTTP headers, using localStorage to keep track of the loaded/cached images.
First, set a far-future expiration date in the HTTP headers of your images
On page load, remove all src from imgs
Loop the images and check localStorage if each image was loaded in the past
If there are images that were not loaded in the past, display a dialog asking the user to confirm downloading those images
If the image was loaded in the past, then put back the src on the img
For every image that is downloaded, save its URL on localStorage
I don't know what the status of indexedDB is on the iPad, but this could be an alternative solution.
In short: IndexedDB is a client-side database. Data is stored in object stores, which hold key/value pairs. The maximum storage capacity is in theory the maximum of your disk space. For more information about IndexedDB:
Specification
My blog
What you could do with the indexeddb:
When someone navigates to a page:
Check every image tag if it is present in the indexeddb
if present
Get the image from the indexeddb and put it in the image tag
if not present
Download it
store it in the indexeddb
put the image in the image tag.
As an extra (in the future), you can do as described by Sev: check the connection type and only download the image when on a fast internet connection.
I have 'invented' a working solution developing a webapp on the iPad (iOS 6.0.x) that may answer your question.
The idea is first to check if a localstorage variable is set/defined or not yet (I use the title of the page, thus the webapp name.)
If this localstorage variable does not exist yet, then assume (in the webapp sandbox context) that it's the first time the app is being run. At this point I populate a UUID in conjunction with $PHP_SESSION($uuid) to avoid 'cross app contamination' in server-side PHP land.
In addition to this I have a dynamic manifest.appcache.php which includes in the CACHE section a list of files to add to the manifest. Thus;
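<?php
// Excerpt: emits the first entry of the file list as one line of the CACHE section.
echo $manifest_file_list[0]."\n";
?>
Using the JS appcache manifest event listeners, I then monitor the progress and write it to the page with something like $('#manifestappcache').html(result);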
I have an application manifest working nicely now to cache my app. However, I have one section that polls the server regularly and will render back different images depending on the state of the app. These images are not cached (it is not realistic to consider caching them), so they show as broken whenever that ajax call tries to draw new images on the screen.
Everything works fine when I have the appcaching off... how do I allow the app to look to the web for certain files instead of only looking at the cache?
You put those files in a NETWORK section in the manifest file. Anything in the network section will always be fetched from the network. Of course, you still have to set appropriate HTTP headers to prevent the browser cache storing those images, and any file in the NETWORK section will, by definition, be unavailable when the app is being used offline.
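To illustrate, here is a small sketch that renders a manifest with both sections. The file names, the `/status/` path and the `render_manifest` helper are made up for the example; only the `CACHE MANIFEST`, `CACHE:` and `NETWORK:` keywords come from the manifest format itself.

```python
from typing import Iterable


def render_manifest(version: str, cached: Iterable[str], network: Iterable[str]) -> str:
    """Render an AppCache manifest with a CACHE section and a NETWORK section."""
    lines = [
        "CACHE MANIFEST",
        f"# version {version}",  # changing this comment makes browsers re-check the cache
        "",
        "CACHE:",
    ]
    lines.extend(cached)  # files stored in the application cache
    lines.extend(["", "NETWORK:"])
    lines.extend(network)  # URLs or prefixes that always go to the network
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_manifest("1", ["index.html", "app.js"], ["/status/"]))
```

An entry in the NETWORK section works as a prefix, so `/status/` would cover every polled image under that path, and a single `*` line lets anything not explicitly cached go to the network. The result should be served with the `text/cache-manifest` MIME type.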