Can we prevent tampering with an offline HTML page or PWA?

Consider a system where we want to send someone a plain HTML+JS file and when loaded in a browser, it "executes" itself. (The inspiration is Portable Secret, which password-protects secrets for a file that can be shared offline, for a very convenient user experience).
The system has lots of flaws. One of them is that the HTML file could be modified while it's sitting around on the operating system, to do anything - for instance, you could tamper with it so when the password is supplied, it sends its secrets over the network to the attacker.
Now, we don't have this problem with most apps these days because they are signed. If you tamper with them, when the OS launches the app, it will (greatly simplifying) hash its contents, and notice that it no longer matches the signature. The signature can't be faked for the usual public-key crypto reasons, blah blah.
So, the question, finally: is there any equivalent anti-tampering standard we can use for an HTML page, stored offline?
I thought that maybe there would be something in Progressive Web Apps, perhaps putting a signature in the manifest, but I don't see anything immediately relevant. The behavior can't be anything defined in the HTML+JS file itself, obviously; it must be something the browser does automatically to check the contents. It might be acceptable if it has to do a network request to do it.

There are a few approaches you could consider for an HTML+JS file stored offline, though each has real limits:
Sign the file: You could sign the file with a private key and ship the signature with it. Any change to the file then invalidates the signature. The catch is that mainstream browsers have no built-in step that verifies a signature on a local HTML file before running it, so the check has to happen outside the page, for example in a separate tool run by the recipient. Verification code placed inside the file can be stripped out by whoever tampers with it.
Use a Content Security Policy: A Content Security Policy (CSP) restricts which sources the page may load from or connect to, which limits what injected code can do. For a local file the policy has to be delivered in a <meta> tag inside the file, so an attacker who can edit the file can also remove the policy. Here CSP is defence in depth, not tamper detection.
Use a Service Worker: If the page is served from an HTTPS origin at least once, a Service Worker can cache it and serve the cached copy offline, so what runs is the copy in the browser's cache rather than a loose file on disk. Service Workers cannot be registered from a file:// URL, so this does not help for a file opened directly from the file system.
Ultimately, there is no foolproof way to prevent tampering with an HTML+JS file stored offline. An attacker with write access can always modify the file, so be aware of this risk and mitigate it as far as possible.
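As a rough illustration of a check that happens outside the page, here is a minimal sketch that compares the file's SHA-256 digest against a value the recipient obtained through some trusted channel. The function names are hypothetical, and this assumes the recipient actually runs the check before opening the file; it is not something the browser does on its own.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_hex(path: Path) -> str:
    """Return the SHA-256 digest of a file as a lowercase hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_untampered(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with one obtained through a trusted channel."""
    # compare_digest avoids leaking, via timing, where the two strings differ.
    return hmac.compare_digest(sha256_hex(path), expected_hex.strip().lower())
```

A plain digest only helps if the expected value itself cannot be altered by the attacker; a real public-key signature removes that requirement but needs a crypto library beyond this sketch.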

Related

download html attribute does not rename the file using external URL

I am trying to rename a file when downloading it from an <a> tag.
Here is a simple example:
<a href="https://i.stack.imgur.com/440u9.png" download="stackoverflow.png">Download Stackoverflow Logo</a>
As you can see, the file is never saved as stackoverflow.png; it keeps its default name.
However, if I download the image and do the same thing with a local route, the file is renamed properly.
Another example:
Download Stackoverflow Logo
The example above works properly.
Why does the download HTML attribute only work with local routes?
Thanks in advance!
The attribute download works only for same origin URLs.
By the way, you really should learn to use proper terminology, or else people won't understand you:
<a href="https://i.stack.imgur.com/440u9.png" download="stackoverflow.png"> is a tag, specifically, an opening tag;
download is an attribute;
stackoverflow.png is the value of the attribute;
https://i.stack.imgur.com/440u9.png is a URL, sometimes called a URI or an address.
The entire construction <a href="https://i.stack.imgur.com/440u9.png" download="stackoverflow.png">Download Stackoverflow Logo</a> is an element.
A "route" is something else entirely, and has no relationship with HTML.
I couldn't find any information on it, but it seems that external resources can't be renamed.
Have a look here, there's an example linking to google image and that doesn't work either - seems like the specs have changed along the way.
This is a security measure applied to cross-origin download requests where the server hosting the download does not use HTTP headers to explicitly mark the file as being for download.
From the HTML specification:
If the algorithm reaches this step, then a download was begun from a
different origin than the resource being downloaded, and the origin
did not mark the file as suitable for downloading, and the download
was not initiated by the user. This could be because a download
attribute was used to trigger the download, or because the resource in
question is not of a type that the user agent supports.
This could be dangerous, because, for instance, a hostile server could
be trying to get a user to unknowingly download private information
and then re-upload it to the hostile server, by tricking the user into
thinking the data is from the hostile server.
Thus, it is in the user's interests that the user be somehow notified
that the resource in question comes from quite a different source, and
to prevent confusion, any suggested file name from the potentially
hostile interface origin should be ignored.
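The specification text hinges on whether the origin "marked the file as suitable for downloading". In practice that marking is usually a Content-Disposition: attachment response header. A minimal sketch, assuming you control the server that hosts the file; the helper name attachment_header is made up for illustration:

```python
from typing import Tuple
from urllib.parse import quote


def attachment_header(filename: str) -> Tuple[str, str]:
    """Build a Content-Disposition header that marks a response as a download."""
    # ASCII fallback for older clients: unencodable characters become "_".
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    fallback = ascii_name.replace("?", "_").replace('"', "_").replace("\\", "_")
    # The extended filename* form carries the exact name as percent-encoded UTF-8.
    encoded = quote(filename, safe="")
    value = f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"
    return ("Content-Disposition", value)
```

When the hosting server sends such a header, the file name it declares is the one that applies, so the name is chosen on the server rather than in the download attribute of the linking page.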

Obvious security flaws in redirect?

I have a web app that stores videos. I am using a Java servlet (over https) which verifies a username and password. Once the details are verified, I redirect the user to a video stored in AWS S3. For those who don't know how S3 works, it's just a web service that stores objects (basically think of it as storing files). It also uses https. Now obviously to make this work, the S3 object (file) is public. I've given it a random name full of numbers and letters.
So the servlet basically looks like this:
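// Simplified: authenticated users are redirected to the public S3 object.
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
    if (authenticateUser(request.getParameter("Username"), request.getParameter("Password"))) {
        response.sendRedirect("https://s3.amazonaws.com/myBucket/xyz1234567.mp4");
    }
}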
This is obviously simplified but it gets the point across. Are there any very obvious security flaws here? The video tag will obviously have a source of something like https://www.mysite.com/getVideo?Username="Me"&Password="randomletters". At first blush it seems like it should be as secure as anything else, assuming I give the files sitting in AWS S3 sufficiently random names?
The obvious security flaw is that anybody could detect which URL the authentication servlet redirects to, and share this URL with all his friends, thus allowing anyone to access the resource directly, without going through the authentication servlet.
Unfortunately, I don't know S3 at all, so I can't recommend any fix to the security problem.
All this mechanism does is provide a very limited obfuscation - using developer tools in most modern browsers (or a proxy such as Fiddler) a user will be able to watch the URL of the video that's being loaded and, if it's in a Public S3 bucket, then simply share the link.
With S3, your only real solution would be to secure the bucket and then either require the user to log in or use temporary tokens for access (http://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html), though this does complicate your solution.
I would also mention that including the username and password in plaintext in the link to the video asset (from the question above) is very insecure.
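To make the idea of a temporary access token concrete, here is a generic sketch of an expiring, HMAC-signed URL. This is not S3's actual signing scheme or API (with S3 you would generate a presigned URL through the AWS SDK); SECRET_KEY, sign_url and verify_url are hypothetical names used only for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; must stay on the server


def _signature(path: str, expires: int) -> str:
    """HMAC over the path and expiry, so neither can be changed by the client."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return the path with an expiry timestamp and a signature appended."""
    start = time.time() if now is None else now
    expires = int(start) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(parts.path, expires)
    return hmac.compare_digest(sig.encode(), expected.encode())
```

A link produced this way can still be shared, but only until it expires, which is the main difference from a permanently public object with a hard-to-guess name.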

Browse and post a path not the file

I've got a slightly unusual scenario. A web app running on a local network can perform various operations on any file on the network it can access. At present the user copies/pastes the UNC path to the file into a text input and clicks submit.
The server retrieves the file, performs some operations and returns the results to the user.
I'd like to allow the user to browse for the file using the webpage - but I don't want to upload the file, just get the full path to it. Is this possible?
I'm aware there will be a couple of scenarios which are doomed to failure - eg browsing to a local path not a UNC share but I can cover this with some validation. There will also be scenarios when the server can access a path the user can't (this is intentional) so browsing wouldn't work here.
All users will be techies who should get the point. Of course, if there were a way to limit the browse dialog to a UNC path, that would be even better but I suspect it's impossible.
Note, we already limit support to the latest versions of the main browsers and since this is just a utility feature, limited support is acceptable.
Sorry, that can't be done. It's a security feature.

How do I specify a wildcard in the HTML5 cache manifest to load all images in a directory?

I have a lot of images in a folder that are used in the application. When using the cache manifest it would be easier maintenance wise if I could specify a wild card to load all the images or files in a certain directory to be cached.
E.g.
CACHE MANIFEST
# 2011-11-3-v0.1.8
#--------------------------------
# Pages
#--------------------------------
../index.html
../edit.html
#--------------------------------
# JavaScript
#--------------------------------
../js/jquery.js
../js/main.js
#--------------------------------
# Images
#--------------------------------
../img/*.png
Can this be done? Have tried it in a few browsers with ../img/* as well but it doesn't seem to work.
It would be easier, but how's it going to work? The manifest file is something which is parsed and acted upon in the browser, which has no special knowledge of files on your server other than what you've told it. If the browser sees this:
../img/*.png
What is the first image the browser should request from the server? Let's start with these:
../img/1.png
../img/2.png
../img/3.png
../img/4.png
...
../img/2147483647.png
That's all the images that might exist with a numeric name, stopping semi-arbitrarily at 2^31-1. How many of those 2 billion files exist in your img directory? Do you really want a browser making all those requests only to get 2 billion 404s? For completeness the browser would probably also want to request all the zero-filled equivalents:
../img/01.png
../img/02.png
../img/03.png
../img/04.png
...
../img/001.png
../img/002.png
../img/003.png
../img/004.png
...
../img/0001.png
../img/0002.png
../img/0003.png
../img/0004.png
...
Now the browser's made more than 4 billion HTTP requests for files which mostly aren't there, and it's not yet even got on to letters or punctuation in constructing the possible filenames which might exist on the server. This is not a feasible way for the manifest file to work. The server is where the files in the img directory are known, so it's on the server that the list of files has to be constructed.
I don't think it works that way. You'll have to specify all of the images one by one, or have a simple PHP script to loop through the directory and output the file (with the correct text/cache-manifest header of course).
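The server-side approach can be sketched roughly as follows. The answer above suggests PHP; this is the same idea in Python, with a hypothetical build_manifest helper, and it assumes the manifest is served from the site root so that entries are written relative to it. Whatever serves the result still has to send it with the text/cache-manifest content type.

```python
from pathlib import Path
from typing import Iterable


def build_manifest(
    root: Path,
    version: str,
    patterns: Iterable[str] = ("*.html", "js/*.js", "img/*.png"),
) -> str:
    """Expand glob patterns on the server and list each match explicitly."""
    lines = ["CACHE MANIFEST", f"# {version}"]
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            if path.is_file():
                # Manifest entries are URLs, so always use forward slashes.
                lines.append(path.relative_to(root).as_posix())
    return "\n".join(lines) + "\n"
```

The wildcard lives in the script, where the directory contents are known, and the browser only ever sees a concrete list of files.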
It would be a big security issue if browsers could request folder listings - that's why Tomcat turns that capability off by default now.
But, the browser could locate all matches to the wildcards referenced by the pages it caches. This approach would still be problematic (like, what about images not initially used but set dynamically by JavaScript, etc., and it would require that all cached items not only be downloaded but parsed as well).
If you are trying to automate this process instead of doing it manually, use a script or, as I do, manifestR. It will output your manifest/appcache file and all you have to do is copy and paste. I've used it successfully and usually only have to make a few changes.
Also, I recommend using the network header with the wild card:
NETWORK:
*
This allows all assets from other linked domains via JSON, for instance, to download into the cache. I believe that this is the only header where you can specify a wildcard. Like the others have said here, it's for security reasons.
The cache manifest is now deprecated and you should use HTTP Cache-Control headers to control caching.
For example:
<meta http-equiv="Cache-control" content="public">
Public - may be stored by any cache, including shared caches.
Private - may be stored only by the user's own (private) cache.
No-Cache - may be stored, but must be revalidated with the server before each reuse.
No-Store - must not be stored in any cache.

secure images (gmail)

I was wondering how to keep images secure on my website. We have a site that requires login, then the user can view thousands of different images, all named after their ID in the database.
Even though you need to log in to view the images the proper way...nothing is stopping a user from browsing through the images by typing <website-director>/image-folder/11232.jpg or something.
This is not the end of the world but definitely not ideal. I see that to stop this Facebook just names the images something much more complicated + stores them in hashed folders.
Gmail does a very interesting thing, their image tags looks like this:
<img src=/mail/?attid=0.1&disp=emb&view=att&th=12d7d49120a940e5>
I thought the src attribute has to contain a reference to an image??...how does gmail get around this?
This is more for educational purposes at this point, as I think this gmail scheme might be overkill for our implementation.
Thanks for your feedback in advance,
Andrew
I thought the src attribute has to contain a reference to an image?
GMail is referencing an image. It's just being pulled dynamically, probably based off of that th=12d7d49120a940e5 string.
Try browsing to http://mail.google.com/mail/?attid=0.1&disp=emb&view=att&th=12d7d49120a940e5
Instead of it being a direct path to its location on the server's filesystem, it uses a dynamic script (the images may even be in a database, who knows).
Besides serving up an image dynamically from your webapp, it's also possible to use a webapp to dynamically authorize access to static resources that the webserver will serve -- commonly by putting the files somewhere that the webserver has access to, but not mapped to any public URI, and then using something like X-Sendfile (lighttpd, Apache with mod_sendfile, others), X-Accel-Redirect (nginx), X-Reproxy-File (Perlbal), etc. etc. Or with FastCGI you can configure an application in a FastCGI "authorizer" role rather than a content provider.
Any of these will let you check the image being authorized, and the user's session, and make whatever decision you need to, without tying up a process of your backend application for the entire time that the image is being sent to the client. It's not universally true, but usually a connection to the backend app represents a lot more resources being reserved than a connection to the webserver, so freeing them up ASAP is smart.
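A minimal sketch of the authorize-then-hand-off pattern, written as a bare WSGI app that answers with an X-Accel-Redirect header for nginx. Several things here are assumptions made for illustration: the AUTHORIZED_SESSIONS set stands in for a real session store, the cookie is assumed to be called "session", and /protected/ is assumed to be an nginx location marked internal.

```python
import re
from http.cookies import SimpleCookie

AUTHORIZED_SESSIONS = {"session-abc"}  # hypothetical stand-in for a session store


def app(environ, start_response):
    """Check the session and image ID, then let the web server send the file."""
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    session = cookies.get("session")
    # Take the last path segment as the image ID and accept digits only,
    # so the value cannot be used to reach other paths.
    image_id = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    allowed = (
        session is not None
        and session.value in AUTHORIZED_SESSIONS
        and re.fullmatch(r"[0-9]+", image_id) is not None
    )
    if not allowed:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"forbidden"]
    headers = [
        ("Content-Type", "image/jpeg"),
        # nginx replaces this response with the file at the internal location.
        ("X-Accel-Redirect", f"/protected/{image_id}.jpg"),
    ]
    start_response("200 OK", headers)
    return [b""]
```

The application returns almost immediately with an empty body, and the web server does the actual file transfer, which is the resource saving described above.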
The code that runs after this GET request is issued:
/mail/?attid=0.1&disp=emb&view=att&th=12d7d49120a940e5
outputs an image to the browser. Something doesn't have to be named with a .jpg or .png or whatever ending to be considered an image by a browser. This is how captcha algorithms are able to serve up different images depending on a value in the id. For example, this link:
http://www.google.com/recaptcha/api/image?c=03AHJ_VusfT0XgPXYUae-4RQX2qJ98iyf_N-LjX3sAwm2tv1cxWGe8pkNqGghQKBbRjM9wQpI1lFM-gJnK0Q8G3Nirwkec-nY8Jqtl9rwEvVZ2EoPlwZrmjkHT7SM32cCE8PLYXWMpEOZr5Uo6cIXz1mWFsz5Qad1iwA
serves up an image, even though the URL has no image file extension.
So the answer really is to just obfuscate your image names/links a bit like Facebook does so that people can't easily guess them.
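If you go the obfuscation route, the names should come from a cryptographically secure random source rather than anything derived from the database ID. A small sketch, with new_image_name as a hypothetical helper; you would store the generated name next to the image's ID in the database:

```python
import secrets


def new_image_name(extension: str = "jpg") -> str:
    """Return an unguessable file name for a stored image."""
    # 16 random bytes (128 bits) encoded as URL-safe text.
    return f"{secrets.token_urlsafe(16)}.{extension}"
```

This only makes names hard to guess; anyone who learns a URL can still open it without logging in.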