Rendering behavior of remote linked files and scripts - html

Let's say my web page references images, files and scripts that are not available locally, i.e. they are not under the website's physical path,
e.g.
<script src="http://libraryheaven.com/somescript.js"></script>
<link rel="stylesheet" type="text/css" href="http://www.styles.com/plugs/mystyle.css"/>
<img src="http://www.google.com/logo.png">
When the browser starts rendering the response HTML, does it resolve these dependencies itself, making separate HTTP requests to fetch the files from their remote locations? Or does it ask my website's web server for them, so that the web server fetches the files and returns them to the client? Or is the web server intelligent enough to fetch and send all the dependencies along with the page? Please explain; I haven't read the theory of rendering, so I don't know how it works.

When you enter a URL in your web browser, you tell the browser to fetch whatever can be found at that particular URL. In most cases that is an HTML file, or some server code which produces HTML on the fly.
When the browser gets the HTML, it tries to interpret it (that's its primary task, after all).
While interpreting the HTML, whenever the browser meets a tag such as script, link or img whose src or href attribute points to a resource the page needs, it makes a separate request to the URL in the attribute value. These URLs usually point to images, style sheets and JavaScript files. The browser fetches whatever it finds there and tries to interpret the downloaded resources as well (show the image, apply the style sheet, execute the JavaScript).
So to answer your question:
Yes, the browser will download all the resources by itself from the URLs in the aforementioned attributes.
No, the web server does not take care of any external references in the HTML it serves or produces.
No, the web server does not try to play smart here and does not try to give you more than you ask for.
So basically if you put something like this in HTML
<img src="http://www.google.com/logo.png" />
then you know that any browser interpreting this HTML will try to fetch image logo.png from google.
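To make the idea concrete, here is a rough sketch (not how a real browser is implemented) that lists the URLs a browser would request separately for a piece of HTML. The function name `listSubresourceUrls`, the page URL `http://www.example.com/blog/index.html` and the extra `images/local.png` entry are assumptions made up for illustration; the regular expression is a simplification, since real browsers use a full HTML parser.

```javascript
// Sketch: list the absolute URLs a browser would request for a page.
// Only script, link and img tags are considered, and only src/href attributes.
function listSubresourceUrls(html, pageUrl) {
  const attrPattern = /<(script|img|link)\b[^>]*?\b(?:src|href)\s*=\s*["']([^"']+)["']/gi;
  const urls = [];
  for (const match of html.matchAll(attrPattern)) {
    // Relative references are resolved against the page's own URL.
    urls.push(new URL(match[2], pageUrl).href);
  }
  return urls;
}

const page = `
<script src="http://libraryheaven.com/somescript.js"></script>
<link rel="stylesheet" type="text/css" href="http://www.styles.com/plugs/mystyle.css"/>
<img src="http://www.google.com/logo.png">
<img src="images/local.png">`;

// Hypothetical address of the page that contains the HTML above.
const requests = listSubresourceUrls(page, "http://www.example.com/blog/index.html");
console.log(requests);
```

Each entry in `requests` corresponds to one request the browser sends directly to the host named in that URL; the server that delivered the page is only contacted again for the relative `images/local.png` reference.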

Related

Serve different resource depending on full URL of requesting page

Let's say that we have two pages:
https://www.example.com/first/firstpage.html
https://www.example.com/second/secondpage.html
that both load the resource https://www.example.com/resource.js
If I want the server that serves resource.js to be able to serve a different version of resource.js depending on which page the request is coming from, is there a reliable header upon which the full URL of the requesting page can be determined (or maybe there is some other way to determine this)?
I know that there is an Origin header, but from my understanding this just represents the domain (and any subdomains) without the full URL and query string. Is there any way for the server to know the full URL and query string that the request for the resource is coming from?
If this isn't possible, I know it would be easy to include that info in the JS script tag as follows:
<script src="/resource.js?origin=/first/firstpage.html"></script>
But I don't want to have to modify the script tag for each page. Is there some other way to have the page automatically include its own URL in the query string of the resource request (without having to dynamically load the resource using my own JS script - HTML only please!), or just any unique identifier, so that the script tag doesn't have to be modified individually on each page?
There's the Referer header that you can use.
Make sure that your response uses Vary: Referer. Otherwise, browsers are going to cache this resource as if the referring page URL didn't matter.
I'd urge you not to do this at all, though. You're going to create a rabbit hole of problems, as not all browsers or proxy servers are well behaved. Some are going to cache this aggressively anyway, no matter what you do with the Vary header.
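As a minimal sketch of the Referer approach, assuming a Node.js server: the `variants` table, the `pickVariant` helper and the script bodies are hypothetical names invented for illustration. The header may be missing or malformed (for example when a referrer policy or a privacy tool strips it), so the sketch falls back to a default variant.

```javascript
const http = require("http");

// Hypothetical variants, keyed by the path of the referring page.
const variants = {
  "/first/firstpage.html": "console.log('variant for first page');",
  "/second/secondpage.html": "console.log('variant for second page');",
};
const fallback = "console.log('default variant');";

// Pick a script body from the Referer header value (may be missing or malformed).
function pickVariant(referer) {
  try {
    const path = new URL(referer).pathname;
    return variants[path] ?? fallback;
  } catch {
    return fallback;
  }
}

const server = http.createServer((req, res) => {
  if (req.url.split("?")[0] !== "/resource.js") {
    res.writeHead(404);
    res.end();
    return;
  }
  res.writeHead(200, {
    "Content-Type": "text/javascript",
    // Tell caches that the response depends on the Referer header.
    "Vary": "Referer",
  });
  res.end(pickVariant(req.headers.referer));
});

// The server is created but not started; call server.listen(8080) to serve requests.
```

The full referring URL, including its query string, is available through `new URL(referer)` if the path alone is not enough to choose a variant.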

How to edit HTML from a site you don't have access to source of?

In Chrome Dev Tools you can edit and make persistent changes to style elements.
https://developer.chrome.com/devtools/docs/workspaces
You can also edit any HTML from any site and preview it live, sort of editing any site including ones you don't own or have access to.
However, I want to persistently, for me at least, edit the HTML, not just the style elements. How can I do this?
More specifically, I want to change the URLs of the static resources as if they're on a CDN.
Now:
Request: http://www.targetsite.tld/
<html>
<img src="http://www.targetsite.tld/image1.jpg">
</html>
Goal:
Request: http://www.targetsite.tld/
<html>
<img src="http://testcdn.tld/targetsite.tld/image1.jpg">
</html>
Hosts file editing won't work as the initial request will then not resolve to the right server. I really want to load the document from the existing server, not save the entire source off somewhere, then edit that.
I've found this nodejs script but remain hopeful I could achieve something more simply on the client side within the browser.
http://www.deanmao.com/2012/08/28/modify-a-site-you-dont-own/
I probably need some kind of browser extension that allows me to tag certain dom element nodes, write some rewrites for them, save this profile and then reload the page.
Does something like this exist?
The answer is user scripts, in particular Greasemonkey for Firefox and Tampermonkey for Chrome. These are browser add-ons/extensions which allow you to manipulate DOM elements on the pages you visit, using simple JavaScript to achieve your goals.
Going this route, I achieved my goal, with one caveat:
The browser first parses the original HTML and makes all the HTTP requests for the assets it finds in the original source page. Only then does the user script manipulate the content. Any edits you make on the fly with your user script are therefore loaded after the original HTML. So in my case:
<img src="http://www.targetsite.tld/image1.jpg">
The original image gets requested from the original host. Then my user script in Tampermonkey rewrites the URLs, causing the browser to then also request my new img:
<img src="http://testcdn.tld/targetsite.tld/image1.jpg">
In other words, it doesn't so much replace the image, it duplicates the request, altering the second one. This, of course, has implications for performance measurements etc. So beware.
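A user script along these lines might look like the following sketch. The script name, the `toCdnUrl` helper and the choice to rewrite only `img` elements are assumptions for illustration, not the script the answer's author actually used; the `typeof document` guard only exists so the helper can also be run outside a browser.

```javascript
// ==UserScript==
// @name         Rewrite images to test CDN (sketch)
// @match        http://www.targetsite.tld/*
// @grant        none
// ==/UserScript==

// Map an original asset URL to its test-CDN equivalent.
function toCdnUrl(src) {
  const url = new URL(src);
  if (url.hostname !== "www.targetsite.tld") return src; // leave other hosts alone
  return "http://testcdn.tld/targetsite.tld" + url.pathname + url.search;
}

// In the browser, rewrite every image that is already in the DOM.
if (typeof document !== "undefined") {
  for (const img of document.querySelectorAll("img[src]")) {
    img.src = toCdnUrl(img.src); // img.src is the resolved absolute URL
  }
}
```

Because the script runs after the original HTML has been parsed, it shows the same duplicated-request behavior described above.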

How can I hide the full url of my website?

When I upload my website files to my server and go to my website, I see index.html in the URL bar of the browser. How can I hide this?
http://bakpinar.com/about/about-us.html
I would like it to look like in this example;
http://www.royaltyline.com
As you can see, only the website address is shown in the URL bar of the browser. And when you click through to another page, it doesn't show the .php, .asp or .html extension, just the folder name.
To hide the extension shown in the address bar, you have two options.
If you control the server, you can define rules that rewrite the URL based on the one the user is trying to reach. On Apache (the usual setup for PHP sites) you can use the .htaccess file to define mod_rewrite rules. For similar features on IIS 7 and above you can install the URL Rewrite module, which Application Request Routing also relies on. In IIS (Windows) you can also set up default documents that are served when users request a directory.
You can also have all of your pages accessed through the same page using AJAX, or put all the content on the same page, hide it with CSS, and display it with CSS and/or JS.
This is a very high level answer, because the specifics vary greatly from situation to situation.
An easy way to do this, in case someone is still looking, is to use a full-screen iframe. No matter where on the site your users are, they will only ever see the main URL. This used to be very popular back in the day, but it was a terrible practice in terms of usability.
<html><head><!-- the stuff --></head><body>
<iframe src="http://bakpinar.com/about/about-us.html" width="100%" height="100%"></iframe></body></html>
Write that into the index.html file at http://www.royaltyline.com
Yes, you can do this with JavaScript.
<script>
window.history.replaceState('','','/');
</script>
It's not actually a folder name. It's a rewritten URL.
To do this you should redirect all requests to one file (index.php, for example), then parse the URL and, based on its parts, serve the corresponding file.
To redirect everything to index.php, use Apache's mod_rewrite module together with an .htaccess file.
To choose the specific file you can implement one of several approaches. This is usually called routing.
A completely different approach would be to use AJAX to reload content, but that is not how the website you gave as an example does it.
In general there is a lot of information about routing URLs in PHP on the web. Just do some research.
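The routing step itself is independent of the language. As a rough sketch, written in JavaScript rather than PHP and using a made-up route table, the single entry-point file could map the requested path to the file that holds the content like this (`routes` and `resolveRoute` are hypothetical names):

```javascript
// Hypothetical route table: clean URL path -> file that holds the content.
const routes = new Map([
  ["/", "index.html"],
  ["/about", "about/about-us.html"],
]);

// Resolve a requested URL to a file, or null when no route matches.
function resolveRoute(requestUrl) {
  // Drop the query string and any trailing slash ("/about/" -> "/about").
  const path = requestUrl.split("?")[0].replace(/\/+$/, "") || "/";
  return routes.get(path) ?? null;
}

console.log(resolveRoute("/about/"));
```

A request that matches no route returns `null`, which the entry point would normally turn into a 404 response.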
You are effectively looking to rewrite URLs. If your web server is Apache, you can use the rewriting module (mod_rewrite) to map requests for http://bakpinar.com/about/ to http://bakpinar.com/about/about-us.html
If you are not running Apache, most web servers will serve index.html as the default page when a directory is requested, so renaming about-us.html to index.html and changing incoming links from /about/about-us.html to simply /about/ will give you the same result.

Is it secure to blindly trust image urls and output them into html img tags on a site? Can it be used to inject code?

I have to process a feed from a data provider, in this feed they provide us with image URL, currently we download them and store them in our own media server, but I was wondering if it was safe to simply get the url and output it directly in the html as the src attribute of an img tag.
My main concern is whether this exposes us to the possibility of someone placing files under that URL which could run malicious scripts or do something other than render an image (or fail to render an image if it isn't one or doesn't exist, which is fine).
Will the img src attribute only render images, or will it download the file specified in the URL to the user's browser regardless of what it is?
I can verify at the import stage that the URL at least appears to be a valid image URL, so it would only ever have .jpg or whatever as an extension, but obviously this might still allow them to redirect to something else.
Image URLs can of course point to scripts (with some URL rewriting), but there's no risk of a script being run by an image load. The data at the URL is treated as binary image data, not as runnable text/script.
If it's a script, then to your browser it's nothing more than a corrupted image file.
So, no risk of code injection. At least as far as I know.
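A separate concern from what the URL points to is how the URL string itself is written into the page: if it is output unescaped, a crafted value could break out of the src attribute. A minimal sketch of a defensive approach, assuming the tag is built in JavaScript (`escapeAttr` and `imgTagFor` are hypothetical helper names): accept only absolute http(s) URLs and escape the value before placing it in the attribute.

```javascript
// Escape characters that are special inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an <img> tag for a feed URL, or return null if the URL is not http(s).
function imgTagFor(feedUrl) {
  let url;
  try {
    url = new URL(feedUrl);
  } catch {
    return null; // not an absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;
  return `<img src="${escapeAttr(url.href)}" alt="">`;
}

console.log(imgTagFor("http://example.com/a.jpg"));
```

Rejected URLs return `null`, so the import stage can skip or log them instead of rendering a tag.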

Loading resources from html5 filesystem api

I am writing a Chrome extension that dynamically writes some HTML pages and their resources to the file system. I have most things working, but I just noticed that when I try to open one of the pages by navigating to the filesystem:chrome-extension://... URL that I obtain via the fileentry.getURL() method, the page opens, but Chrome does not fetch any of the associated resources: stylesheets, images etc. Any ideas why this might be? Are there some security flags I need to set to get this working? Am I going about this all wrong?
(One thing that may be relevant is that the resources are identified by relative urls. But I know they are correct relative to the file because if i manually resolve them and browse to the URLs I can fetch them.)
The page you include that uses the relative URLs doesn't understand the HTML5 filesystem's mapping. If you change the URLs to point to what the fileentry.getURL() calls give you, then this should work.
There's currently a bug filed about allowing relative URLs in resources to be used the way you're trying to use them: http://crbug.com/89271