Treat files with no extension as HTML? - html

So I'm recreating a website from web.archive.org. I've downloaded it and it has many pages. The problem is that the old site was a PHP forum script, and I obviously can't recreate that. For now I would be satisfied with a static HTML copy until I build something else.
So the problem now is that there are a lot of files generated from query URLs, like this:
index.php#lang=fr
index.php#lang=fr&section=4
index.php#lang=fr&section=5
index.php#section=15&fonc=imp&lang=fr
etc...
And when I upload these files to my server, the browser treats these extensionless files as plain text instead of HTML, despite the HTML content inside.
Can anyone tell me why this is happening and whether there is an easy way to solve it?
EDIT: So apparently it is the download software I used that replaced the ? in the original URLs with #. But if I just bulk rename all files from # to ?, they still won't open. So how about the ultimate solution below: how can I do that painlessly and quickly?
Ultimately I would like to place all of the old files in one folder, rename them to .html, and then create .htaccess rules mapping each original URL to its file in that folder. However, doing this manually would take forever. So can anyone suggest a simpler solution?

This happens because your default content type is likely configured to be text/plain (which is the default in Apache 2.2). With HTTP, a resource type is not indicated by a file name extension; it is indicated by the Content-Type response header.
I think you will have to set the default Content-Type header with this directive in your configuration (it applies to Apache 2.2; in Apache 2.4 DefaultType no longer has any effect, and ForceType text/html scoped to the relevant directory is the usual substitute):
DefaultType text/html
See also: http://httpd.apache.org/docs/2.2/mod/core.html#defaulttype
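For the bulk rename plus .htaccess idea from the question, here is a rough Python sketch, not a tested migration tool. It assumes the downloaded files are named exactly like the examples above (script name, then #, then the original query string), that mod_rewrite is available, and that the .htaccess file sits in the directory the old index.php URLs pointed at. The function names, the archive folder and the pageNNNN.html naming are made up for illustration.

```python
# Characters that are special in the regular expressions mod_rewrite uses.
_REGEX_SPECIALS = set(".^$*+?()[]{}|\\")


def escape_regex(text):
    """Backslash-escape regex metacharacters so the text matches literally."""
    return "".join("\\" + ch if ch in _REGEX_SPECIALS else ch for ch in text)


def plan_rewrites(downloaded_names, target_dir="archive"):
    """Return (renames, rules) for files named like 'index.php#lang=fr'.

    renames: list of (old_name, new_path) pairs
    rules:   list of .htaccess lines mapping each original URL to new_path
    target_dir and the page numbering scheme are hypothetical choices.
    """
    renames, rules = [], ["RewriteEngine On"]
    for number, old_name in enumerate(sorted(downloaded_names), start=1):
        # The download tool replaced '?' with '#', so split on the first '#'.
        script, _, query = old_name.partition("#")
        new_path = f"{target_dir}/page{number:04d}.html"
        renames.append((old_name, new_path))
        # Match the query string exactly; an empty query matches '^$'.
        rules.append(f"RewriteCond %{{QUERY_STRING}} ^{escape_regex(query)}$")
        rules.append(f"RewriteRule ^{escape_regex(script)}$ {new_path} [L]")
    return renames, rules


if __name__ == "__main__":
    names = ["index.php#lang=fr", "index.php#lang=fr&section=4"]
    renames, rules = plan_rewrites(names)
    print("\n".join(rules))
```

The renames list could then be fed to os.rename, and the rules written to .htaccess. Note that each rule only matches when the query parameters arrive in the same order as in the file name.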

Related

Get .html filename of a website with Firebug

How do I find the filename of a website I am inspecting with Firebug? For example, when I look at http://example.org/ I can inspect the element and see the whole HTML structure, but I can't find the filename. I am searching for index.html or something like that. Maybe this is a similar question, but I am not sure, because he/she is working with PHP. LINK
I know there are some solutions with Dreamweaver or other tools, but I am searching for an easy way to figure that out with Firebug or a free browser add-on. I hope you have a solution for that.
The URL you entered is the one that usually returns the main HTML contents. Though on most pages nowadays the HTML is altered using JavaScript. Also, pages are very often dynamically generated on the server.
So, in most cases there is no static .html file.
For what it's worth, you can see all network requests and their responses within Firebug's Net panel.
Note that the URL path doesn't necessarily reflect a file path on the server's file system. Where a specific URL maps to in the file system depends on the server configuration. The simplest example is the index file that is automatically served when a domain is accessed. In the case of http://example.org the server might automatically load a file index.html from the file system, for example.
So, in order to get the file name on the file system, you need to either check the server configuration or the related access logs.
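To make the point about URL paths versus file paths concrete, here is a small Python sketch of how a server could translate a URL into a file path. Everything in it is hypothetical: the document root, the aliases mapping (loosely modelled on Apache's Alias directive) and the function name are assumptions for illustration, and it ignores security concerns such as path traversal.

```python
import posixpath
from urllib.parse import urlsplit, unquote


def map_url_to_file(url, document_root, aliases=None, index_name="index.html"):
    """Guess which file a server might read for `url` under a made-up config.

    aliases maps URL prefixes to directories outside document_root.
    """
    path = unquote(urlsplit(url).path) or "/"
    base = document_root
    # Longest matching alias prefix wins.
    for prefix in sorted(aliases or {}, key=len, reverse=True):
        stripped = prefix.rstrip("/")
        if path == prefix or path.startswith(stripped + "/"):
            base = aliases[prefix]
            path = path[len(stripped):] or "/"
            break
    if path.endswith("/"):
        path += index_name  # directory request: fall back to the index file
    return posixpath.join(base, path.lstrip("/"))
```

With a document root of /var/www/html, the URL http://example.org/ would map to /var/www/html/index.html here, while an alias for /docs could send http://example.org/docs/a.html to a completely different directory. None of that is visible from the browser side.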

Why do people name their files index.html?

I have seen a lot of people using that file name for their HTML files. I wonder why? I'm kind of new to HTML and haven't learned much, but when I name my HTML files, I name them whatever I want. When I search for examples of HTML, I find they are named index.html. Why?
I have seen a lot of people using that file name for their HTML files
You would typically use that name for one of your pages, and it would usually be the home page.
When you arrive at a website, for example www.website.com, you're not pointing to a file (like you would be if you typed www.website.com/about.html); you're pointing to a directory.
The webserver will try to serve a file, typically called index.html or index.php by default, but it could be something different, and it's configurable by editing your webserver's config files.
If the server doesn't find any file to serve (because you didn't include an index.html file or because you renamed it without editing the server's config) you will see a listing of the files, which is rarely the desired behavior, especially at the root of a website.
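The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular web server implements it; the function name and the default list of index names are assumptions.

```python
import os
import tempfile


def serve_directory(directory, index_names=("index.html", "index.php")):
    """Return ('file', name) for the first index file found,
    otherwise ('listing', sorted file names)."""
    for name in index_names:
        if os.path.isfile(os.path.join(directory, name)):
            return ("file", name)
    return ("listing", sorted(os.listdir(directory)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "about.html"), "w").close()
        print(serve_directory(root))  # no index file yet: a listing
        open(os.path.join(root, "index.html"), "w").close()
        print(serve_directory(root))  # index.html is picked up now
```

Changing index_names plays the role of editing the server's configuration: the file name itself has no special meaning beyond being on that list.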
Generally the contents of index.html will be returned when just the directory is requested.
e.g. http://example.com/index.html is returned for a request for http://example.com
This is merely convention and is usually configurable.
Here is my take: It was likely named 'index' on the early internet because it is the 'indexing page' that directs to the sub pages, and you would go back to the index page to go to another page. This was before images and search engines. Later it got more advanced, with a menu on all pages. This is how I remember it, but it was a long time ago.
https://twitter.com/PresidentUSW1/status/1442236777293496325?s=20
Many web servers use index.html or default.htm as the default landing page; either way it's simply a start page. The name itself is not required.

Opening directories in HTML

In some websites, I see links that look like this:
Link
The link doesn't go to an html file, but a folder (I believe). I was wondering if this has any purpose, and how to do this. Is there a default file to open when opening a directory? Because when I try something like this, I click the link, then I see a list of files in that folder, and I have to click on the proper link.
Everywhere I look, it says you should do links like this:
Link
Should I just let it go? I'm awfully curious.
This is something that is controlled by the web server. Some will look for a file called default.htm, others will look for index.html. It's usually configurable, and sometimes the server may look for any of a number of variations of index or default.
If such a file is not found, the server will often display a directory listing of all the files found in that folder, but usually that's not a good idea for security reasons. Again, this is something that can be controlled in the settings for the server.
Allowing directory listing is a VERY dangerous and ill-advised practice. You should hide the real directory structure of your site by all means.
PHPDL is a PHP script that lists all the files in a directory (except itself of course). What sets PHPDL apart is that everything the script needs is in one file, including the file-type icons it uses.
Note: You can rename the script to anything you want. It will not list itself as a file to download.
This script is safe and useful; see the demos:
http://greg-j.com/projects/phpdl/PHPDL-v2.php
http://greg-j.com/projects/phpdl/PHPDL-lite.php

html directory listing formatting

So, I've been trying to get a web page to display links to videos (over a symbolic link) dynamically (i.e., without hardcoding an <a></a> tag for each one), and I think I may have found a solution, albeit a hacky one:
Video
Ignoring that this is a horrible way to do this, does anyone know how to format the following?:
I'm guessing there is an apache config file somewhere, but it is extremely hard to search for it as I do not know what it is called when files are just listed in this manner.
I'm basically looking to resize the widths of columns, and maybe even do some prettification.
This is all running on my web/file server and is being accessed from my local machine.
This is what you're looking for:
http://perishablepress.com/better-default-directory-views-with-htaccess/
This tutorial details how Apache's directory listing can be modified to suit your taste using an .htaccess file.
Using Apache's HeaderName and ReadmeName directives from the mod_autoindex module, you can add custom markup to your directory listing pages.
For displaying links to A/V and other files, look at my website: https://wrcraig.com/ApacheDirectoryDescriptions.
It goes beyond the default directory description, providing a spreadsheet to assist in creating detailed descriptions and exporting them in FancyIndex/AddDescription format for inclusion in .htaccess.
It also provides a menu driven BASH scripted alternative, using the FancyIndex descriptive data above (automatically adding A/V durations) to recursively populate a custom index.html while retaining the security features of .htaccess.
The site has examples of the input spreadsheet and both the FancyIndex output and the optional BASH scripted output.
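If you would rather generate your own index page than style Apache's listing, something along these lines could work. It is a minimal Python sketch under my own assumptions (the function name, the list of video extensions and the page layout are all made up); it only builds the HTML, so you stay in control of the markup and column widths.

```python
import html
from urllib.parse import quote


def build_video_index(file_names, extensions=(".mp4", ".webm", ".ogv"), title="Videos"):
    """Return an HTML page with one link per video file in file_names."""
    videos = sorted(name for name in file_names if name.lower().endswith(extensions))
    items = "\n".join(
        # quote() makes the name safe inside a URL, html.escape() inside the text.
        '    <li><a href="{}">{}</a></li>'.format(quote(name), html.escape(name))
        for name in videos
    )
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        '<head><meta charset="utf-8"><title>' + safe_title + "</title></head>\n"
        "<body>\n"
        "  <h1>" + safe_title + "</h1>\n"
        "  <ul>\n" + items + "\n  </ul>\n"
        "</body>\n"
        "</html>\n"
    )
```

You could call it with os.listdir() on the symlinked folder and write the result to index.html in that folder, re-running it whenever videos are added.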

HTML Include file

I have a basic web application packaged as an EAR deployed on GlassFish.
The web module has some html files.
The html files have a common footer, an html file, that I would like to extract out and make an include.
When I do, and put:
<!--#include virtual="insertthisfile.html" -->
in an html file, it does not work.
Should this work?
This is a technique called Server-Side Includes (SSI). It may not be enabled on your web host. If it is, sometimes only files with a .shtml extension are parsed for include directives, so try giving the page that contains the directive a .shtml extension.
If that doesn't work, you might be able to enable SSIs in a .htaccess file (assuming your web server is Apache). You can find instructions on how to do this by googling. There's a decent set here.
If that fails, I would contact your web host and see if they have SSIs enabled.
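To show what the server is expected to do with that directive, here is a rough Python sketch that expands the includes itself, for instance as a build step before deployment. This is not how any server implements SSI; the function name and the read_fragment callback are my own assumptions, and it only handles the include virtual form shown in the question, without nesting.

```python
import re

# Matches directives such as <!--#include virtual="insertthisfile.html" -->
INCLUDE_DIRECTIVE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(page_text, read_fragment):
    """Replace each include directive with the text returned by
    read_fragment(path). Nested includes are not expanded."""
    return INCLUDE_DIRECTIVE.sub(
        lambda match: read_fragment(match.group(1)), page_text
    )


if __name__ == "__main__":
    fragments = {"insertthisfile.html": "<footer>Common footer</footer>"}
    page = '<body><p>Hello</p><!--#include virtual="insertthisfile.html" --></body>'
    print(expand_includes(page, fragments.__getitem__))
```

In practice read_fragment would open the footer file from disk; a dictionary is used here only to keep the sketch self-contained.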
Should this work?
Perhaps, with some special settings and an experienced programmer, this could work.
In my case the include statement seems to be ignored.
I could include some text with
<embed src="include.shtml"></embed>
With "embed", the settings in the header of the page do not apply to the included text; they have to be repeated, and, by default, the result is ugly.
It looks strange, as if the designers of HTML did not build in a very basic tool, an include command. For short articles, an include could reduce file sizes by an order of magnitude.