I have my website hosted on S3 with CloudFront as a CDN, and I need these two URLs to behave the same and to serve the index.html file within the directory:
example.com/directory
example.com/directory/
The URL with the trailing / incorrectly prompts the browser to download a zero-byte file whose name is a random hash. Without the slash, it returns my 404 page.
How can I get both paths to deliver the index.html file within the directory?
If there's a way I'm "supposed" to do this, great! That's what I'm hoping for, but if not I'll probably try to use Lambda@Edge to do a redirect. I need that for some other situations anyway, so some instructions on how to do a 301 or 302 redirect from Lambda@Edge would be helpful too : )
Update (as per John Hanley's Comment)
curl -i https://www.example.com/directory/
HTTP/2 200
content-type: application/x-directory
content-length: 0
date: Sat, 12 Jan 2019 22:07:47 GMT
last-modified: Wed, 31 Jan 2018 00:44:16 GMT
etag: "[id]"
accept-ranges: bytes
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 [id].cloudfront.net (CloudFront)
x-amz-cf-id: [id]
Update
CloudFront has a single behavior configured, which redirects HTTP to HTTPS and sends the requests to S3. It also has a 404 custom error response configured under the errors tab.
S3 only offers automatic index documents when you've enabled and are using the web site hosting features of the bucket, by pointing to the bucket's website hosting endpoint, ${bucket}.s3-website.${region}.amazonaws.com rather than the generic REST endpoint of the bucket, ${bucket}.s3.amazonaws.com.
Web site endpoints and REST endpoints have numerous differences, including this one.
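If you decide to go the website-hosting route, a minimal sketch of enabling it with the AWS SDK for JavaScript might look like the following (the bucket name, region, and error document are placeholders of mine, not values from the question); you would then point the CloudFront origin at the ${bucket}.s3-website.${region}.amazonaws.com endpoint:

// sketch only, assuming AWS SDK for JavaScript v2; 'example-bucket' is a placeholder
const AWS = require('aws-sdk');
const s3 = new AWS.S3({ region: 'us-east-1' }); // use your bucket's region

s3.putBucketWebsite({
    Bucket: 'example-bucket',
    WebsiteConfiguration: {
        IndexDocument: { Suffix: 'index.html' }, // served for paths ending in '/'
        ErrorDocument: { Key: '404.html' }       // optional error document
    }
}, (err, data) => {
    if (err) console.error(err);
    else console.log('website hosting enabled');
});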
The reason you're seeing these 0-byte files for object keys ending in / is because you are creating folder objects in the bucket using the S3 console or another utility that actually creates the 0-byte objects. They aren't needed, once the folders have objects "in" them -- but they're the only way to display an empty folder in the S3 console, which displays an object named foo/ as a folder named foo, even if there are no other objects with a key prefix of foo/. It's part of the visual emulation of a folder hierarchy in the console, even though objects in S3 are never really "in" folders.
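If you want to confirm that these 0-byte placeholder objects exist in your bucket (they aren't needed once the "folders" contain real objects), a rough sketch with the AWS SDK for JavaScript could look like this; the bucket name is a placeholder:

// sketch only, assuming AWS SDK for JavaScript v2; 'example-bucket' is a placeholder
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.listObjectsV2({ Bucket: 'example-bucket' }, (err, data) => {
    if (err) return console.error(err);
    // keys ending in '/' with zero size are the console's "folder" placeholder objects
    const folderObjects = (data.Contents || []).filter(o => o.Key.endsWith('/') && o.Size === 0);
    folderObjects.forEach(o => console.log('placeholder object:', o.Key));
});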
If for some reason you need to use the REST endpoint -- such as when you don't want to make the bucket public -- then you need two Lambda@Edge triggers in CloudFront to emulate this functionality fairly closely.
An Origin Request trigger can inspect and modify requests after the CloudFront cache is checked, before the request is sent to the origin. We use this to check for a path ending in / and append index.html if we find that.
An Origin Response trigger can inspect and potentially modify responses before they are written into the CloudFront cache. The Origin Response trigger can also inspect the original request that preceded the request that generated the response. We use this to check whether the response is an error. If it is, and the original request does not appear to be for an index document or for a "file" (heuristically: after the final slash in the path, a "file" has at least one character, followed by a dot, followed by at least one more character), we redirect to the original path with a trailing / appended.
Origin Request and Origin Response triggers fire only on cache misses. When there is a cache hit, neither trigger fires, because they are on the origin side of CloudFront -- the back side of the cache. Requests that can be served from the cache are served from the cache, so the triggers are not invoked.
The following is a Lambda@Edge function written in Node.js 8.10. This single Lambda function behaves as either an origin request or an origin response trigger, depending on context. After publishing a version in Lambda, associate that version's ARN with the CloudFront Cache Behavior settings as both an Origin Request and an Origin Response trigger.
'use strict';

// combination origin-request, origin-response trigger to emulate the S3
// website hosting index document functionality, while using the REST
// endpoint for the bucket
// https://stackoverflow.com/a/54263794/1695906

const INDEX_DOCUMENT = 'index.html'; // do not prepend a slash to this value

const HTTP_REDIRECT_CODE = '302'; // or use 301 or another code if desired
const HTTP_REDIRECT_MESSAGE = 'Found';

exports.handler = (event, context, callback) => {
    const cf = event.Records[0].cf;

    if(cf.config.eventType === 'origin-request')
    {
        // if path ends with '/' then append INDEX_DOCUMENT before sending to S3
        if(cf.request.uri.endsWith('/'))
        {
            cf.request.uri = cf.request.uri + INDEX_DOCUMENT;
        }

        // return control to CloudFront, to send request to S3, whether or not
        // we modified it; if we did, the modified URI will be requested.
        return callback(null, cf.request);
    }
    else if(cf.config.eventType === 'origin-response')
    {
        // is the response 403 or 404? If not, we will return it unchanged.
        if(cf.response.status.match(/^40[34]$/))
        {
            // it's an error.

            // we're handling a response, but Lambda@Edge can still see the attributes
            // of the request that generated this response; so, we check whether this
            // is a page that should be redirected with a trailing slash appended. If
            // it doesn't already look like an index document request, doesn't end in
            // a slash, and doesn't look like a filename with an extension... we'll try that.

            // This is essentially what the S3 web site endpoint does if you hit a
            // nonexistent key, so that the browser requests the index with the correct
            // relative path, except that S3 checks whether it will actually work. We
            // are using heuristics, rather than checking the bucket, but checking is
            // an alternative.

            if(!cf.request.uri.endsWith('/' + INDEX_DOCUMENT) && // not a failed request for an index document
               !cf.request.uri.endsWith('/') && // unlikely, unless this code is modified to pass other things through on the request side
               !cf.request.uri.match(/[^\/]+\.[^\/]+$/)) // doesn't look like a filename with an extension
            {
                // add the original error to the response headers, for reference/troubleshooting
                cf.response.headers['x-redirect-reason'] = [{ key: 'X-Redirect-Reason', value: cf.response.status + ' ' + cf.response.statusDescription }];

                // set the redirect code
                cf.response.status = HTTP_REDIRECT_CODE;
                cf.response.statusDescription = HTTP_REDIRECT_MESSAGE;

                // set the Location header with the modified URI;
                // just append the '/', not the "index.html" -- the next request will
                // trigger this function again, and it will be added without appearing
                // in the browser's address bar.
                cf.response.headers['location'] = [{ key: 'Location', value: cf.request.uri + '/' }];

                // not strictly necessary, since browsers don't display it, but remove
                // the response body with the S3 error XML in it
                cf.response.body = '';
            }
        }

        // return control to CloudFront, with either the original response, or
        // the modified response, if we modified it.
        return callback(null, cf.response);
    }
    else // this is not intended as a viewer-side trigger; throw an exception, visible
         // only in the Lambda CloudWatch logs, and a 502 to the browser.
    {
        return callback(`Lambda function is incorrectly configured; triggered on '${cf.config.eventType}' but expected 'origin-request' or 'origin-response'`);
    }
};
The answers given are wrong. CloudFront has its own configuration to have www.yourdomain.com/ serve up a document. It's called the "default root object", and its configuration is found under the "General" tab of your CloudFront distribution. Here are the full steps for getting an SSL/HTTPS-enabled custom domain + CloudFront + S3 bucket.
Create a brand new S3 bucket with default (closed-off) permissions or remove all public access from the target bucket.
Disable static website hosting. You don't need it.
If you haven't already, get your SSL cert into Amazon so you can attach it to the cloudfront distribution which will be pointing to your S3 bucket.
Create a cloudfront distribution pointing to the target S3 bucket, utilizing the cert.
For the origin configuration, use the www.yourdomain.com.s3.amazonaws.com form for the origin, NOT the static website hosting URL (which should be disabled anyway).
Let the cloudfront config automatically change the S3 bucket access ("restrict bucket access"). You want access to the bucket restricted to this cloudfront distribution ONLY (via a specific identity). No one should be hitting your S3 bucket directly, especially since it can serve via http (no "s"). A sketch of the resulting bucket policy appears after these steps.
Under the cloudfront "general" tab (or during setup) set your default root object to "index.html" or whatever. Otherwise, requests to https://www.yourdomain.com/ will show permission denied.
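For reference, the bucket policy that CloudFront writes when you let it restrict bucket access looks roughly like this; the origin access identity ID and bucket name below are placeholders, and the console fills in your real values:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E1EXAMPLE"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::www.yourdomain.com/*"
        }
    ]
}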
AWS has recently launched CloudFront Functions, which can be used for this use case.
CloudFront Functions are cheaper, faster, and easier to implement and test than Lambda@Edge.
Below is a sample function that appends index.html to the request URI when it is not provided in the requested path.
function handler(event) {
    var request = event.request;
    var uri = request.uri;

    // Check whether the URI is missing a file name.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    }
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    return request;
}
This will not append index.html in the web browser address bar, which gives a cleaner URL while browsing. In your case https://www.example.com/directory/ will remain as such while browsing, but will render the content of https://www.example.com/directory/index.html.
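As a quick sanity check, the function above can be exercised locally with a mock event object (the event shape below is a simplified assumption, just enough for this handler):

// run with Node to see the rewrite; the events below are minimal mocks
var result = handler({ request: { uri: '/directory/', method: 'GET', headers: {} } });
console.log(result.uri); // "/directory/index.html"

var result2 = handler({ request: { uri: '/directory', method: 'GET', headers: {} } });
console.log(result2.uri); // "/directory/index.html"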
More samples can be found at https://github.com/aws-samples/amazon-cloudfront-functions/blob/main/url-rewrite-single-page-apps/index.js
This type of behavior is usually controlled or caused by your HTTP(S) header data, specifically the Content-Type that your client receives.
Inspect the header and try tweaking what gets returned from your server. That should lead to your solution.
In Chrome, visit a URL, right click, select Inspect to open the developer tools.
Select Network tab.
Reload the page, select any HTTP request on the left panel, and the HTTP headers will be displayed on the right panel.
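If the headers show a wrong Content-Type on an object you control, one way to "tweak what gets returned" is to re-upload the object with an explicit Content-Type. A sketch with the AWS SDK for JavaScript, where the bucket, key, and file name are placeholders of mine:

// sketch only, assuming AWS SDK for JavaScript v2; bucket, key, and file are placeholders
const AWS = require('aws-sdk');
const fs = require('fs');
const s3 = new AWS.S3();

s3.putObject({
    Bucket: 'example-bucket',
    Key: 'directory/index.html',
    Body: fs.readFileSync('index.html'),
    ContentType: 'text/html' // the type the browser should receive
}, (err, data) => {
    if (err) console.error(err);
    else console.log('uploaded with explicit Content-Type');
});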
Related
Let's say we have a request to an S3 bucket to get an image:
<img src="https://s3-us-west-2.amazonaws.com/{BUCKET}/logo.png" />
I need to work on this project without having access to the internet, so within my Express server, I need to find a way to redirect all requests from https://s3-us-west-2.amazonaws.com/{BUCKET} to ~/Desktop/project/{BUCKET}.
Is there a way to do this via proxying, or would it be a better idea to cut a new branch and replace all external asset links with local file locations?
You would get something like this in your network panel:
"https://s3-us-west-2.amazonaws.com/{BUCKET}/logo.png"
You can basically remove every occurrence of "https://s3-us-west-2.amazonaws.com".
And let's say you run it on localhost:3000; your request will then look like http://localhost:3000/{BUCKET}/logo.png.
You can add the following lines to your Express server.
var request = require('request');
var proxy = true; // if running locally, else false

app.get('/{BUCKET}/logo.png', function (req, res) {
    if (proxy) {
        res.sendFile('/home/Desktop/project/' + req.url);
    } else {
        var options = {
            url: 'http://s3-us-west-2.amazonaws.com' + req.url,
            method: 'GET'
        };
        req.pipe(request(options)).pipe(res);
    }
});
The problem with this approach is that every asset requested from S3 now goes through your Express server, so the load of fetching the assets falls on it. That is fine for development, but it is not recommended for production.
So for the final deployment you can put all the "https://s3-us-west-2.amazonaws.com" prefixes back.
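As a variation on the same idea (a sketch, assuming the assets really do live under ~/Desktop/project/{BUCKET} as described in the question), you could serve the whole local folder with express.static during development instead of handling each asset route individually:

// sketch only: serve everything under the local project folder while offline
const express = require('express');
const path = require('path');
const app = express();

const useLocal = true; // true while developing offline, false otherwise

if (useLocal) {
    // any request to /{BUCKET}/... is answered from ~/Desktop/project/{BUCKET}/...
    // (process.env.HOME assumes a Unix-like system)
    app.use('/{BUCKET}', express.static(path.join(process.env.HOME, 'Desktop/project/{BUCKET}')));
}

app.listen(3000);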
If you don't want to do it programmatically, you can use proxy tools like Charles or Fiddler. They capture all the traffic from your system, and you can create rules for particular requests, or sets of requests, to be fetched from local files instead of S3.
function Hello($scope, $http) {
    $http.get('http://localhost/api/Country')
        .success(function(data, status) {
            $scope.greeting = data;
        }).error(function(data, status) {
            alert('Error');
        });
}
URL: When I try to pull the data from the URL, it shows me a 0 KB file. When I click that URL directly, it shows some data.
I tried this on my application before you replaced the URL with localhost (I guess it was changed for security reasons); the problem seems to come from a wrong configuration on the server side, not Angular.
Firefox triggers an error saying:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at http://localhost/api/Country. This can be fixed by moving the resource to the same domain or enabling CORS.
If you are in charge of this server, you should take a look at Cross-Origin Resource Sharing (CORS), but your Angular code is correct, sorry :D
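For completeness, what "enabling CORS" means depends on the server behind localhost/api. Purely as an illustration, if it were a Node/Express backend (an assumption, not something the question states), a minimal sketch could be:

// illustrative sketch only -- the real backend may be a different stack entirely
const express = require('express');
const app = express();

app.use((req, res, next) => {
    // allow pages from other origins to read the response ('*' is fine for testing)
    res.setHeader('Access-Control-Allow-Origin', '*');
    res.setHeader('Access-Control-Allow-Methods', 'GET, OPTIONS');
    next();
});

app.get('/api/Country', (req, res) => {
    res.json([{ name: 'France' }, { name: 'Japan' }]); // sample data
});

app.listen(80); // matches http://localhost/api/Country; may need elevated privileges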
I'm trying to get Dojo to show JSON data that comes from a remote web service. I need to be clear though - the web server hosting the html/dojo page I access isn't the same server as the one running the web service that returns the JSON data - the web service server just can't serve html pages reliably (don't ask!!).
As a test I moved the page onto the same web server as the web service and the code below works. As soon as I move it back, so that the html/dojo is served from Apache (//myhost.nodomain:82, say) and the web service sending the JSON is at "{target: http://myhost.nodomain:8181}", it stops working.
I've used Firefox to look at the network & I see the web service being called OK; the JSON data is returned too & looks correct (I know it is, from the previous test), but the fields are no longer set. I've tried this with DataGrid and the plain page below, with the same effects.
Am I tripping up over something obvious???
Thanks
require([
    "dojo/store/JsonRest",
    "dojo/store/Memory",
    "dojo/store/Cache",
    "dojox/grid/DataGrid",
    "dojo/data/ObjectStore",
    "dojo/query",
    "dojo/domReady!"
],
function(JsonRest, Memory, Cache, DataGrid, ObjectStore, query) {
    var myStore, dataStore, grid;

    myStore = JsonRest({
        target: "http://localhost:8181/ws/job/definition/",
        idProperty: "JOB_NAME"
    });

    myStore.query("JOB00001").then(function(results) {
        var theJobDef = results[0];
        dojo.byId("JOB_NAME").innerHTML = theJobDef.JOB_NAME;
        dojo.byId("SCHEDULED_DAYS").innerHTML = theJobDef.SCHEDULED_DAYS;
    });
});
It's true what Frans said about the cross-domain restriction, but Dojo has dojo/request/iframe to work around the problem:
require(["dojo/request/iframe"], function(iframe){
iframe("something.xml", {
handleAs: "json"
}).then(function(xmldoc){
// Do something with the XML document
}, function(err){
// Handle the error condition
});
// Progress events are not supported using the iframe provider
});
You can simply use this, and the returned data can be inserted into a store and then into the grid.
Are you familiar with the Same Origin Policy:
http://en.wikipedia.org/wiki/Same-origin_policy
Basically it restricts websites to do AJAX requests to other domains than the html page was loaded from. Common solutions to overcome this are CORS and JSON-P. However, remember that these restrictions are made for security reasons.
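As a sketch of the JSON-P route with Dojo (assuming the service can wrap its output in a callback and accepts a query parameter named "callback"; the URL below is illustrative, based on the question's endpoint):

require(["dojo/request/script"], function(script){
    script.get("http://myhost.nodomain:8181/ws/job/definition/JOB00001", {
        jsonp: "callback" // name of the callback query parameter the service expects
    }).then(function(data){
        // data is the parsed JSON-P payload
        console.log(data.JOB_NAME);
    }, function(err){
        console.error(err);
    });
});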
Question I think is self explanatory, but if you need more, here it is:
Chrome Extension A saves an email address in localstorage.
Chrome Extension B wants to see that email address.
Is this permitted? (This might be more of an HTML5 thing than a Chrome-specific thing, but my knowledge is limited so I'll frame it within the context of my desire to know the answer).
If you own the two extensions (for instance, you're the one maintaining both of them), you can definitely use cross-extension messaging to pass that email address, or even the whole localStorage, to the other extension.
For example, take a look at my extension here:
https://github.com/mohamedmansour/reload-all-tabs-extension/tree/v2
One extension is the core, and the other one is just the browser action (right now they are merged as of v3), but v2 has them communicate with each other. The browser action sends a "ping" event, and the core extension listens for that event and returns a "pong". The browser action extension is an "Add-On" to the core extension. When you open up "Options", it uses the options from the core one.
Back to your question ... to access localStorage across extensions, you can do something like this:
main core extension:
localStorage['foo'] = 'bar';

var secondary_extension_id = 'pecaecnbopekjflcoeeiogjaogdjdpoe';
chrome.extension.onRequestExternal.addListener(
    function(request, sender, response) {
        // Verify the request is coming from the Add-On.
        if (sender.id != secondary_extension_id)
            return;
        // Handle the request.
        if (request.getLocalStorage) {
            response({result: localStorage});
        } else {
            response({}); // Snub them.
        }
    }
);
secondary extension:
var main_extension_id = 'gighmmpiobklfepjocnamgkkbiglidom';
chrome.extension.sendRequest(main_extension_id, {getLocalStorage: 1},
    function (response) {
        var storage = response.result;
        alert(storage['foo']); // This should print out 'bar'.
    }
);
BTW, I didn't really test this code. I just copied and pasted from the Reload All Tabs extension, which does something similar.
Not directly, but you can send messages between extensions. So if an extension that stores emails is expecting a request from some external extension, it could read the required data and send it back. More about it here.
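The chrome.extension request APIs used above are the older ones; with the newer chrome.runtime messaging API the same exchange looks roughly like this (a sketch; the extension IDs reuse the placeholders from the answer above):

// in the extension that owns the data -- listen for messages from other extensions
chrome.runtime.onMessageExternal.addListener(function(request, sender, sendResponse) {
    if (sender.id !== 'pecaecnbopekjflcoeeiogjaogdjdpoe') return; // only answer the known add-on
    if (request.getEmail) {
        sendResponse({ email: localStorage['email'] });
    }
});

// in the extension that wants the data -- send a message to the other extension by ID
chrome.runtime.sendMessage('gighmmpiobklfepjocnamgkkbiglidom', { getEmail: true }, function(response) {
    console.log(response.email);
});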
Is it possible to intercept a 404 error without using a web server (browsing an HTML file in the filesystem)?
I tried with some JavaScript, using a hidden iframe that preloads the destination page, checks the result, and then triggers a custom error or redirects to the correct page.
This works fine, but it is not good for performance.
A 404 error is an HTTP status response. So unless you are trying to retrieve this file using an HTTP request/response, you can't have a genuine 404 error. You can only mimic one in something like the way you suggest. Any "standard" way of handling a 404 error is dependent on your flavour of web server anyway...
404 is an HTTP response code, and as such it is only delivered through the HTTP protocol by servers that speak it. The file:// scheme isn't a real protocol with responses as such; it's a hack built into clients (like browsers) to enable local file support, so it's up to the browsers/clients themselves whether they expose any response codes from their file:// implementation. In theory they could report them in the DOM, for example, but those would be response codes they expose to themselves, and this is rarely implemented. Most don't, and there isn't a standard way to do it. You may look into browser extensions, for Firefox for example, and see if they support it, but this is highly non-standard and will likely break if you put it on the web.
Why don't you want to use the server?
I don't believe that it's possible to handle a 404 error client-side, because a 404 error is server-side.
Whenever you load a webpage, you make a request to the server. Thus, when you ask for a file that's not there, it's the server that handles the error. Regular HTML/CSS/JavaScript only come into the picture when the server sends back a response to tell you that it can't find the file.
Steve
Since I was looking for this today: you can now do this without a server by using a Service Worker to cache a custom 404 page and then serve it when a fetch request comes back with status 404. Following the instructions in the Google caching codelab, the worker file looks as follows:
const filesToCache = [
    '/',
    '404.html'
];
const staticCacheName = 'pages-cache-v1';

self.addEventListener('install', event => {
    console.log('Attempting to install service worker and cache static assets');
    event.waitUntil(
        caches.open(staticCacheName).then(cache => {
            return cache.addAll(filesToCache);
        })
    );
});

self.addEventListener('fetch', event => {
    console.log('Fetch event for ', event.request.url);
    event.respondWith(
        caches.match(event.request).then(response => {
            if (response) {
                console.log('Found ', event.request.url, ' in cache');
                return response;
            }
            console.log('Network request for ', event.request.url);
            return fetch(event.request).then(response => {
                console.log('response.status:', response.status);
                // fetch request returned 404, serve custom 404 page
                if (response.status === 404) {
                    return caches.match('404.html');
                }
                // otherwise pass the network response through unchanged
                return response;
            });
        })
    );
});
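The page itself still has to register the worker; a minimal registration snippet (assuming the worker above is saved as sw.js next to the page) would be:

// register the service worker from the page, if the browser supports it
if ('serviceWorker' in navigator) {
    navigator.serviceWorker.register('sw.js')
        .then(reg => console.log('Service worker registered, scope:', reg.scope))
        .catch(err => console.error('Service worker registration failed:', err));
}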