wget --page-requisites locally? - html

I have a local HTML file referencing image and style data in various places in the local file system. I’d like to get a list of all referenced files; or alternatively a command that will copy the HTML files and all referenced files to some clear location (with or without changing the links in the HTML file), so that I can make a self-contained ZIP file of the HTML page.
It seems that wget provides good support for downloading an HTML file including all prerequisites (images, styles) using the --page-requisites flag. Unfortunately, it does not support file:// URLs.
What are my options here?

Why not set up a local Apache server and serve the page from localhost? wget can then fetch it over http:// with --page-requisites.
Tools such as EasyPHP or MAMP make it easy to set up a local Apache server.
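If all you need is the list of referenced files, a small script may be enough and avoids the server entirely. The following is a minimal sketch using only the Python standard library; the names ReferenceCollector and local_references are made up for this example, and it only looks at img, script and link tags, so references made from inside CSS (such as url(...)) would be missed.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class ReferenceCollector(HTMLParser):
    """Collect src/href values from tags that load external resources."""

    # Tag -> attribute naming the resource. Deliberately not exhaustive.
    RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.references.append(value)


def local_references(html_text, base_dir):
    """Return referenced local files as absolute paths, skipping remote URLs."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    paths = []
    for ref in collector.references:
        parsed = urlparse(ref)
        # Skip http://, https://, data:, //host/... and pure fragments.
        if parsed.scheme not in ("", "file") or parsed.netloc or not parsed.path:
            continue
        # Relative references are resolved against the HTML file's directory.
        paths.append((Path(base_dir) / unquote(parsed.path)).resolve())
    return paths
```

Calling local_references(Path("page.html").read_text(), Path("page.html").parent) would give the paths to copy next to the HTML file before zipping; page.html is a placeholder name here.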

Related

Mediawiki: Links to open local files on the server doesn't work

I have MediaWiki installed on a Synology server. I would like to create a link on the wiki that opens files stored on the same server.
Here are the steps I took to achieve this:
Added $wgUrlProtocols[] = "file://"; in LocalSettings.php
Created a test file on the server: file://myServerName/path/to/file/test.txt. Putting this URL directly into my Chrome browser opens the file.
Created a page in MediaWiki with a link to this file using [[file://myServerName/path/to/file/test.txt]]
When I click the link on the generated wiki page, nothing happens. However, when I hover over the link, it shows the correct URL.
Can someone please point out what additional steps I need to do to get this working?
The file:// protocol points to a file on your own computer. I'm not fully sure, but I think you cannot use it to retrieve a file from a different machine (see my note below about Samba shares).
From quick research, it looks like Chrome blocks links to file:// URLs on pages served over HTTP, while browsers like IE should allow you to open those files. This is done for security reasons, so that a malicious site cannot open local files without your permission. You might bypass this by installing a special plugin in Chrome (look for Enable file links).
Instead of using file protocol, make those files available via Synology WebStation, and then create links that point to the file via webstation (not via path on the server). With that approach, links attached on your MediaWiki pages will work as those will be regular links.
If you don't use WebStation, you might also try ftp:// links (using the FTP service), or link to Samba shares - that's where the file:// protocol might work, but again, I'm not sure and I cannot test it as I do not use Windows.
I think that the safest/easiest/fastest way is to expose those files via WebStation.
Source: https://en.wikipedia.org/wiki/File_URI_scheme
The file URI scheme is a URI scheme defined in RFC 8089, typically used to retrieve files from within one's own computer.

How to display images from varying directories

I have a website that can have images in varying directories. I'm running Linux, and some of the images can be in /tmp/ while others are in a directory outside the codebase. So for example, I have:
/tmp/
/home/work/codebase/htmlfiles
/home/stuff/stuff/images
The code I'm using to try and access these directories is this:
<img src="' + path + image + '">;
Where path is the directory and image is the filename. path does end with /. Currently this just gives 404 errors, even though I have confirmed that the file exists in that directory.
Am I missing something? Does HTML not allow you to navigate from the root directory?
Your web server serves files relative to a web root directory.
So if your website is in /home/stuff/stuff, the web server does the following translation:
/index.html -> /home/stuff/stuff/index.html
/images/image1.png -> /home/stuff/stuff/images/image1.png
/tmp/ -> /home/stuff/stuff/tmp/
To do otherwise would be a massive security risk, allowing any online user to pull arbitrary files from your system.
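To make the translation concrete, here is a rough sketch of what a web server does with a request path. It is not how Apache is implemented; the function name translate is invented for this illustration, and the web root is passed in as an argument.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def translate(web_root, url_path):
    """Map a URL path to a file under web_root, or None if it would escape."""
    root = Path(web_root).resolve()
    # Drop any query string or fragment, then decode %xx escapes.
    decoded = unquote(urlsplit(url_path).path)
    # Strip the leading "/" so the join stays relative to the web root.
    candidate = (root / decoded.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        return None  # e.g. "/../../etc/passwd" would leave the web root
    return candidate
```

With a web root of /home/stuff/stuff, a request for /tmp/ ends up under /home/stuff/stuff/tmp and never reaches the real /tmp, which is why the images there come back as 404.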
There are a few possible solutions to this; which is best will depend on your situation.
You can map web paths to different paths on the system:
http://httpd.apache.org/docs/2.0/mod/mod_alias.html#alias
You can symlink the directories holding your images into the webroot. Ensure that you allow the webserver to follow symlinks.
https://superuser.com/questions/244245/how-do-i-get-apache-to-follow-symlinks
You can also hard-link the files into the webroot, serve them through a server-side scripting language, or simply move the files.
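For local testing without Apache, the idea behind an alias can be imitated with Python's built-in http.server. This is only a sketch of the concept, not Apache's Alias directive: make_alias_handler, serve and the aliases mapping are names invented for this example, and it assumes Python 3.7 or later.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_alias_handler(web_root, aliases):
    """Return a handler class that serves web_root plus aliased directories.

    aliases maps a URL prefix such as "/images/" to a directory on disk.
    """

    class AliasHandler(SimpleHTTPRequestHandler):
        def translate_path(self, path):
            for prefix, directory in aliases.items():
                if path.startswith(prefix):
                    # Serve the rest of the URL from the aliased directory.
                    self.directory = directory
                    return super().translate_path("/" + path[len(prefix):])
            # Everything else is served from the normal web root.
            self.directory = web_root
            return super().translate_path(path)

    return AliasHandler


def serve(web_root, aliases, port=8000):
    """Serve on loopback only; blocks until interrupted."""
    handler = make_alias_handler(web_root, aliases)
    ThreadingHTTPServer(("127.0.0.1", port), handler).serve_forever()
```

For the directories in the question, something like serve("/home/work/codebase/htmlfiles", {"/images/": "/home/stuff/stuff/images"}) would make <img src="/images/photo.png"> resolve to a file outside the codebase (photo.png being a placeholder). Only the directories you list are exposed, which keeps the security property described above.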

How to create link in HTML that download that file

I have a server at http://192.168.230.237:20080.
The file is located at "/etc/Jay/log/jay.txt".
I tried "http://192.168.230.237:20080/etc/Jay/log/jay.txt", but this link gives me "404 Not Found".
How can I link to my file?
Your HTTP server will have a configuration option somewhere (Apache HTTPD calls it DocumentRoot) which determines where http://example.com/ maps onto the filesystem of the computer.
Commonly this will be /var/www/.
Unless you change it to / (which would expose your entire filesystem over HTTP and is very much not recommended), you can't access arbitrary files on the computer.
/etc/ is used to store configuration information for software installed on the computer. It should almost never be exposed outside the computer.
The best solution to your problem is probably:
Look at the configuration of your HTTP server and identify the document root (e.g. /var/www/)
Move your website files to that directory
If you really want to expose files under /etc via HTTP then you could also change the document root.
Your webserver might also support features like Apache HTTPD's Alias directive which allows you to map a URL onto a file that can be outside the DocumentRoot.

Set absolute path for root directory in HTML on local filesystem

How can I use absolute paths in my website while testing on my local filesystem? I know that I can use / to access the root directory of my website. However, this works only when my website is on the server. I want to be able to use absolute paths on my local filesystem so that I can do proper testing before uploading.
Is there a way to set a variable to a root directory in HTML? Something similar to Linux, where you can define a variable WEBPATH=/home/user/website. Then I could use e.g. src="WEBPATH/folder/file.html" for all the files in my website and modify WEBPATH depending on whether I am testing locally or using the server root folder.
I am open to other workarounds as well.
I'm assuming you're using a file url to access your HTML in the browser, in which case an easy way to get absolute paths working is by using a local webserver to serve your site.
If you have Python 3 installed, you can run python3 -m http.server from the command line at your web root, and it will serve your site at localhost:8000.
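If you would rather start the server from a script, or point it at a directory other than the current one, the same thing can be sketched with the standard library. make_server is a name invented for this example, and it assumes Python 3.7 or later (for the directory argument and ThreadingHTTPServer).

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(web_root, port=8000):
    """Build a server whose "/" maps to web_root (not started yet)."""
    handler = partial(SimpleHTTPRequestHandler, directory=web_root)
    # Bind to loopback only so the site is not exposed to the network.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Running make_server("/home/user/website").serve_forever() should then let src="/folder/file.html" resolve locally the same way it would on the real server.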

how to run html on python openshift server

I have an OpenShift server running Python. However, when I call php via SSL, the PHP interpreter starts running, which suggests that there might be a way to run PHP as well. Plain HTML is good enough for me, though. The problem is that I do not know how to reach HTML files on my server: whenever I try, I get 404 Not Found. I've read about a solution that involves placing a .htaccess file:
AddType application/x-httpd-php .html
I am not exactly sure where to place this file, but placing it in the folder of the .html file does not help.
Could you please help me make .html files reachable on an OpenShift server running Python? How about PHP?
Put the .html file in your app-root/repo/wsgi/static folder (or in that folder in your git repository). If you want it to be served as app-domain.rhcloud.com/file.html, you will have to use a .htaccess file in your wsgi folder that rewrites file.html to static/file.html.