Website auto-detect SHTML pages - html

We just received the code for a website. Nearly all of the links are broken because the developers didn't add .shtml as a file extension to any of the links.
Is this due to them simply forgetting or is there some configuration where we would basically tell the website to automatically detect ".shtml" pages?
Thanks!

There are numerous ways for a server to map a URL onto a file based on a word in the URL with the addition of an extension (either a hard coded one or one provided in a list).
The obvious ones that spring to mind for Apache HTTP Server (your mileage will vary with your web server) are MultiViews and mod_rewrite.
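To make the idea concrete, here is a minimal sketch in Python of the "try a list of extensions" lookup. It is not how Apache implements MultiViews or mod_rewrite; the function name `resolve` and the extension list are assumptions made up for illustration.

```python
from pathlib import Path

# Hypothetical extension list; a real server reads this from its configuration.
CANDIDATE_EXTENSIONS = (".shtml", ".html")


def resolve(doc_root: Path, url_path: str, extensions=CANDIDATE_EXTENSIONS):
    """Return the file a URL path maps to, or None if nothing matches.

    Illustration only: there is no protection against "../" in the path.
    """
    base = doc_root / url_path.lstrip("/")
    if base.is_file():
        return base  # the URL already names a real file
    for ext in extensions:
        # Append the extension to the last path segment and look again.
        candidate = base.with_name(base.name + ext)
        if candidate.is_file():
            return candidate
    return None
```

With a file about.shtml in the document root, a request for /about would resolve to that file, which is the behaviour the broken links appear to rely on.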

Related

How to create a W3C-validation-link for local files?

I have some local (self generated) HTML files. I view them in browser via file:// without a local webserver.
I would like to have a link in the footer of each of these files to "Validate" them on the W3C Validator. On some websites I see links like this in the source:
<p class="validation">Validate</p>
But of course this doesn't work, because there is no referer for local files.
I asked this on Pro Webmasters, but the question was out of scope there.
EDIT: Uploading the file to the validator website via its web form is not an option. I would like to send the whole HTML source to the validator without any external tools.
You can't.
The way the referer link works is to tell the validation service to look at the Referer header to get the URL of the previous page, then request that page and validate it.
Even if there were a Referer header for your local file, there is no way for the validation service to access it. It would be a serious security problem if every website you visited could read files from your hard disk freely!
Use the file upload feature to validate files without a public URL.
Alternatively, consider using a local validation tool and possibly looking for an extension to your IDE that makes it more convenient to access (such as this extension for VS Code).

Specifying index.html in browser to load home page

If I want to load the homepage of https://medium.com/ by typing the exact index.html file address into my browser, how would I do that? Or is it not possible?
https://medium.com/index.html gives me a 404 error. Also curious how I would do this more broadly with any webpage for which my browser is displaying a url that does not end in .html.
Static websites that are hosted as plain files usually have an index.html document. It can be requested directly, and it is also what the server normally returns when no particular document is specified, so https://example.com/ and https://example.com/index.html both work.
But this is not how most websites work. Pages can be generated dynamically on the server: you send a request, and if the path matches some server operation, the server creates a response for you. Unless https://example.com/ returns documents from a directory, using something classic like the Apache web server set to serve static files, it won't work.
There is no general way to know what, if any, URLs for a given website resolve to duplicates of the homepage (or any other page).
Dynamically generated sites, in particular, tend not to have alternative URLs for pages.
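A rough sketch of why /index.html can 404 on a dynamic site: the server matches paths against handlers, not against files on disk. The route table and function names below are hypothetical and do not describe how medium.com works.

```python
# Hypothetical route table for a dynamically generated site: each path is
# matched against a handler function, not against a file on disk.
def home():
    return "<h1>Home</h1>"


ROUTES = {"/": home}


def handle(path: str):
    """Return (status, body) for a request path."""
    handler = ROUTES.get(path)
    if handler is None:
        # No route registered, so there is nothing to serve for this path.
        return 404, "Not Found"
    return 200, handler()
```

Here "/" is answered by a handler, while "/index.html" has no route at all, so it gets a 404 even though both would work on a static file host.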

AMP: why are files with .amp.html extensions not displayed on linux hosting?

I recently converted all the web pages of my website to AMP and renamed them all to .amp.html. I took care to test each page with the AMP tester: https://ampbyexample.com/playground/
I also bought a domain name served over HTTPS and Linux hosting at GoDaddy. However, when I upload the files with the .amp.html extension, nothing is displayed at the domain. By contrast, when I rename all the files to plain .html, the website is displayed. My question is: why are files with the .amp.html extension not displayed?
The problem comes down to webserver configuration, and likely has two issues.
The first is that you're probably expecting a default document to appear when you don't request a specific one. For example, http://example.com/... the path here is just /, but a web server will commonly load index.html from disk. Chances are, your web server is not configured to load index.amp.html from disk.
The second issue may come down to a bad MIME type configuration. It's important that text/html; charset=utf-8 be sent as the Content-Type response header value for your HTML files.
If you have control over your webserver, you can reconfigure it yourself. You didn't tell us what server you're using, so we can't tell you specifically how to do that. If you don't have control over your webserver, you'll have to take it up with your hosting provider... GoDaddy. Or, just name things .html and you'll be fine!
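Both issues can be sketched in a few lines of Python. This is only an illustration of the lookup logic, not GoDaddy's or any particular server's implementation; the function names and the default index list are assumptions.

```python
import mimetypes
from pathlib import Path


def default_document(directory: Path, index_names=("index.html",)):
    """Return the first index file present in the directory, or None."""
    for name in index_names:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def content_type(path: Path) -> str:
    """Build a Content-Type header value (the charset is assumed to be UTF-8)."""
    mime, _encoding = mimetypes.guess_type(path.name)
    mime = mime or "application/octet-stream"
    if mime.startswith("text/"):
        return f"{mime}; charset=utf-8"
    return mime
```

With only index.amp.html in the directory, the default list finds nothing for a request to /. Adding "index.amp.html" to the list fixes that, which is roughly what changing the server's default-document setting would do.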

Get .html filename of a website with Firebug

How do I find the filename of a website I am inspecting with Firebug? For example, when I look at http://example.org/ I can inspect the elements and see the whole HTML structure, but I can't find the filename. I am looking for index.html or something like that. Maybe this is a similar question, but I am not sure, because he/she is working with PHP: LINK
I know there are some solutions with Dreamweaver or other tools, but I am looking for an easy way to figure this out with Firebug or a free browser add-on. I hope you have a solution for that.
The URL you entered is the one that usually returns the main HTML contents, though on most pages nowadays the HTML is altered using JavaScript. Also, pages are very often dynamically generated on the server.
So, in most cases there is no static .html file.
For what it's worth, you can see all network requests and their responses within Firebug's Net panel.
Note that the URL path doesn't necessarily reflect a file path on the server's file system. Where a specific URL maps to in the file system depends on the server configuration. The simplest example is the index file that is served automatically when a domain is accessed. In the case of http://example.org, the server might automatically load a file named index.html from the file system, for example.
So, in order to get the file name on the file system, you need to either check the server configuration or the related access logs.

web-development: how do you usually handle the "under construction" page?

I was wondering what's the best way to switch a website to a temporary "under construction" page and then switch it back to the new version.
For example, in a website, my customer decided to switch from Joomla to Drupal and I had to create a subfolder for the new CMS, and then move all the content to the root folder.
1) Moving all the content back to the root folder always creates some problems with file permissions, links, etc.
2) Creating a rewrite rule in .htaccess or forwarding with PHP is not a solution, because a different URL that includes the top folder is shown.
3) Many hosting services do not allow changing the root directory, so this is not an option since I don't have access to the Apache config file.
Thanks
Update: Maybe I can forward only the domain (i.e. www.example.com) and leave the IP address (i.e. 123.24.214.22) on the root folder, so that access is different for me and for other people? Can I do this in the .htaccess file?
One thing to consider is that you don't want search engines to cache your under construction page, and you also don't want them to drop your homepage from the search index either (hence just adding a "noindex" meta tag isn't the perfect solution).
A good way to deal with this is do a 302 redirect (temporarily moved) from your homepage to your under construction page - that way the search engine does not cache your homepage as an under construction page, does not index your under construction page (assuming it has a NOINDEX meta tag), and does not drop your homepage from the search index either.
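As a rough illustration of that redirect, here is a minimal WSGI sketch in Python. The path `/under-construction.html`, the `SITE_OFFLINE` switch and the placeholder markup are all assumptions for the example; on shared hosting the same effect would normally come from server configuration instead.

```python
CONSTRUCTION_PATH = "/under-construction.html"  # hypothetical placeholder URL
SITE_OFFLINE = True  # hypothetical switch; set to False to restore the site

# The placeholder carries a noindex meta tag so it is not indexed itself.
PLACEHOLDER = (
    b'<html><head><meta name="robots" content="noindex"></head>'
    b"<body>Under construction</body></html>"
)


def app(environ, start_response):
    """Minimal WSGI app: temporarily redirect the homepage while offline."""
    path = environ.get("PATH_INFO", "/")
    html = [("Content-Type", "text/html; charset=utf-8")]
    if path == CONSTRUCTION_PATH:
        start_response("200 OK", html)
        return [PLACEHOLDER]
    if SITE_OFFLINE and path == "/":
        # 302 marks the move as temporary, so the homepage URL stays canonical.
        start_response("302 Found", [("Location", CONSTRUCTION_PATH)])
        return [b""]
    start_response("200 OK", html)
    return [b"<h1>Home</h1>"]
```

To try it locally, the app can be served with wsgiref.simple_server.make_server from the standard library.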
One way would be the use of an include on your template page.
When you want the construction page to show, you set a redirect in the include to take all traffic to the construction page.
When you are done, you remove the redirect.
What about hijacking your index.php file?
Something simple, along the lines of
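<?php
// SITE_OFFLINE is assumed to be defined elsewhere, e.g. define('SITE_OFFLINE', true);
if (SITE_OFFLINE) {
    // Show the placeholder page instead of the real site.
    include 'under_construction.html';
} else {
    // normal content of your index page
}
?>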
where you would naturally define SITE_OFFLINE in an appropriate place for your needs.
What I did when I used PHP for websites was to configure Apache to direct all requests to a front controller. You then would have full access to all requests no matter where they are pointing to. Then in your front controller (PHP file, static html file, etc.), you would do whatever you need to do there.
I believe you need to configure pathinfo in Apache and some other settings, it has been about 3 years since I have used that approach. But, this approach is also good for developing your own CMS or application so that you have full control over security.
You have to do something similar to this:
http://www.phpwact.org/pattern/front_controller
I am looking for more details, I know my configuration had more to it than that.
This is part of what I'm looking for too:
http://httpd.apache.org/docs/2.0/mod/core.html
Enabling path_info passes path information to the script, so all requests now go through a single point of entry. Let me find my configuration, I know vaguely how this works, but I'm sure it looks like a lot of hand waving.
Also, keep in mind that because all requests go through this single PHP file, you are responsible for serving images, JavaScript, CSS, etc. So, if a request comes in for /css/default.css, it will go through your PHP script (index.php, most likely), and you'll need to determine how to handle it. Serving static files is trivial, but it is a little more work.
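To show what that extra work looks like, here is a small front-controller sketch in Python rather than PHP. It is not my original configuration; the prefix list, the `dispatch` function and the generated page body are hypothetical.

```python
import mimetypes
from html import escape
from pathlib import Path

# Hypothetical URL prefixes that are served straight from disk.
STATIC_PREFIXES = ("/css/", "/js/", "/images/")


def dispatch(path: str, static_root: Path):
    """Single entry point: return (status, content_type, body) for any path."""
    if path.startswith(STATIC_PREFIXES):
        target = (static_root / path.lstrip("/")).resolve()
        # Refuse paths that escape the static root (e.g. via "..").
        if static_root.resolve() not in target.parents or not target.is_file():
            return 404, "text/plain", b"Not Found"
        mime, _encoding = mimetypes.guess_type(target.name)
        return 200, mime or "application/octet-stream", target.read_bytes()
    # Everything else is treated as a dynamic page in this sketch.
    body = f"<h1>Page for {escape(path)}</h1>".encode("utf-8")
    return 200, "text/html", body
```

The point is that once every request goes through one entry point, the controller has to decide for itself which requests are static files and which content type to send for them.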
If you don't want to go that route, you could possibly do something with mod_rewrite so that it only looks for .html, .htm pages or however you have your site configured. For me, I don't do extensions, so that made my regex a little more difficult. I also wanted to secure access to all files. The path_info was the solution for me, but if you don't need that granularity, then writing a front controller might be a bit too much work.
Walter