File and folder with same name - html

How do I make it so this:
www.example.com/services
Will point to this file:
/services.html
but this:
www.example.com/services/pocket-pool
Will point to this file in the services directory:
/services/pocket-pool.html
I have these rewrite rules in my .htaccess:
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/?([^\.]+)$ $1.html [L]
ErrorDocument 404 /404.html
Thanks!

Well, you nearly have your solution; you just have to extend it to take care of the second situation you want to handle:
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/?([^\.]+)$ $1.html [L]
RewriteRule ^/?services/([\w-]+)/?$ /services/$1.html [END]
ErrorDocument 404 /404.html
If you receive an internal server error (HTTP status 500) using the rule above, chances are that you operate a very old version of the Apache HTTP server. In that case you will see a definite hint to an unsupported [END] flag in your HTTP server's error log file; the flag requires version 2.3.9 or later. You can either upgrade or use the older [L] flag, which will probably work the same in this situation, though that depends a bit on your setup.
This rule will work likewise in the HTTP server's host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the HTTP server and enabled in the HTTP host. If you use a dynamic configuration file, you need to make sure that its interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the HTTP server's host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, are hard to debug, and they really slow down the HTTP server. They are only provided as a last option for situations where you do not have access to the real HTTP server's host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).
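For the clash between services.html and the services/ directory specifically, a minimal sketch for an .htaccess file in the document root might look like the following. It assumes that mod_dir's automatic trailing-slash redirect (DirectorySlash) is what sends /services to the directory rather than to the file; that is an assumption about the setup, not something stated in the question.

```apache
RewriteEngine On

# Redirect plain HTTP to HTTPS; [L] stops later rules from touching the redirect.
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# Assumption: without this, mod_dir redirects /services to /services/
# because a directory of that name exists.
DirectorySlash Off

# Serve <name>.html for extensionless requests when such a file exists.
# This covers both /services and /services/pocket-pool.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/?([^.]+)$ $1.html [L]

ErrorDocument 404 /404.html
```

The Apache documentation warns that turning DirectorySlash off can expose directory listings for requests without a trailing slash, so it is probably wise to combine this with Options -Indexes.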

Related

Removing .html extension from URL

So I would like to be able to go to example.com/page.html from example.com/page (without the .html extension). I've been looking around and it seems to have something to do with the .htaccess file, but I have no idea how to actually configure the rule. Thanks for the help!
Update:
I've only really tried various suggestions I found online, for example:
RewriteCond %{THE_REQUEST} \s/+(.+?)\.html[\s?] [NC]
RewriteRule ^ /%1 [R=302,L,NE]
# To internally forward /dir/file to /dir/file.html
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1\.html -f [NC]
RewriteRule ^(.+?)/?$ /$1.html [L]
This probably is what you are looking for:
RewriteEngine on
RewriteCond %{REQUEST_URI} -f
RewriteRule ^/?(.+)\.html$ /$1 [R=301]
RewriteCond %{REQUEST_URI} !-f
RewriteCond %{REQUEST_URI} !-d
RewriteCond %{REQUEST_URI}.html -f
RewriteRule ^/?(.+)/?$ /$1.html [END]
This rule will work likewise in the HTTP server's host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the HTTP server and enabled in the HTTP host. If you use a dynamic configuration file, you need to make sure that its interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the HTTP server's host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, are hard to debug, and they really slow down the HTTP server. They are only provided as a last option for situations where you do not have access to the real HTTP server's host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).
If you get an "internal server error" (HTTP status 500) using the above rule set, chances are that you operate a very old version of the Apache HTTP server. In that case you will find a definite hint to an unsupported [END] flag in your HTTP server's error log. Either upgrade your HTTP server to a more or less current version or try using the older [L] flag; it should work the same, though that depends a bit on your setup.
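One thing to watch with file tests: -f and -d check a filesystem path, so inside an .htaccess file %{REQUEST_FILENAME} is usually the variable to test, since %{REQUEST_URI} only holds the URL path. A variant sketch under that assumption (an .htaccess file in the document root), which reuses the %{THE_REQUEST} pattern from the question so that it also works with the older [L] flag:

```apache
RewriteEngine On

# Externally redirect /page.html to /page, but only when the client itself
# asked for the .html URL (THE_REQUEST holds the original request line,
# so internally rewritten requests do not trigger a redirect loop).
RewriteCond %{THE_REQUEST} \s/+(.+?)\.html[\s?]
RewriteRule ^ /%1 [R=301,L]

# Internally serve /page from page.html when that file exists on disk.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ $1.html [L]
```

While testing it may be safer to use R=302 first, because browsers cache 301 redirects aggressively.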
With the following, the URL example.com/page loads its content from the file /page.html:
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.*)$ $1.html [L]

How to htaccess 301 redirect pages with a question mark in the url

I'm trying to redirect several pages that all have question marks in the URL.
I essentially want to redirect:
www.example.com/?attachment_id=456 to www.example.com
There are a ton of pages with different ID numbers as well.
I've tried a few things in .htaccess with no luck.
Any ideas?
This is what I tried:
RewriteCond %{QUERY_STRING} ^attachment_id=[0-9]
RewriteRule ^/$ http://www.example.com/? [L,NC,R=301]
Why can't you do this? This code should redirect a URL like this www.example.com/?attachment_id=456
RewriteCond %{QUERY_STRING} ^attachment_id=[0-9]+
RewriteRule ^/?$ http://www.example.com/? [L,NC,R=301]
I made the / optional so that it can be used in Apache config or .htaccess. Also I kept the ? that you have in the redirect at the end of the RewriteRule to remove any query strings on redirect.
Your approach is next to perfect, just some minor corrections:
RewriteEngine on
RewriteCond %{QUERY_STRING} attachment_id=[0-9]+
RewriteRule ^/$ http://www.example.com/? [L,R=301]
The above is the version for the host configuration. The trailing ? in the target drops the query string, so the redirected request does not match the rule again. Note that you have to restart the HTTP server after making changes to the host configuration for them to take effect. To debug, refer to the HTTP server's error log file, especially at restart time.
If you have to rely on .htaccess style files, then the syntax for the rule itself must unfortunately be slightly different:
RewriteEngine on
RewriteCond %{QUERY_STRING} attachment_id=[0-9]+
RewriteRule ^$ http://www.example.com/? [L,R=301]
Such a file has to be located in the document root folder of the host. Also, the interpretation of such files must be enabled in the host configuration by means of the AllowOverride directive.
In general you should always prefer the host configuration for such rules over .htaccess style files, but you need administrative access for that. .htaccess style files are notoriously error prone, hard to debug and really slow the server down.
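On Apache 2.4 and later, the QSD ("query string discard") flag is an alternative to the trailing ? for dropping the query string. A short sketch, assuming a 2.4 server; the anchored query-string pattern is my own tightening and not part of the answers above:

```apache
RewriteEngine On

# Match attachment_id=<digits> as a complete query parameter.
RewriteCond %{QUERY_STRING} (^|&)attachment_id=[0-9]+(&|$)

# ^/?$ works in both the host configuration and an .htaccess file.
# QSD removes the query string from the redirect target.
RewriteRule ^/?$ http://www.example.com/ [R=301,L,QSD]
```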

Automatically Rewrite html,php,css or any file extension using htaccess?

is there a way to automatically remove the file extension from the url?
for ex:
the user types
www.website.com/page.html
it will automatically convert the page to:
www.website.com/page
and if possible with a /:
www.website.com/page/
it must be automatic that the .htaccess will force the url to rewrite the file extension.
You are approaching this the wrong way. You do not want to remove extensions. You want to rewrite extensionless requests to real files. So if the user types
http://foo.bar/file
your server would serve it as if the request had been for
http://foo.bar/file.html
To do this you need mod_rewrite (or an equivalent) and rewriting rules set according to your needs. The same module is also used to make URLs look nicer, so instead of
http://foo.bar/script.php?id=34&smth=abc
you can have
http://foo.bar/script/id/34/smth/abc
or even
http://foo.bar/script/34/abc
Read more on mod rewrite.
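As a rough illustration of the last form, a rule along these lines could map /script/34/abc onto script.php. It is only a sketch, assuming an .htaccess file in the document root, MultiViews switched off, and a numeric id; the parameter names id and smth are taken from the example URL above.

```apache
RewriteEngine On

# /script/34/abc  ->  /script.php?id=34&smth=abc (internal rewrite only;
# the browser keeps showing the short URL).
# QSA appends any query string the client sent to the rewritten one.
RewriteRule ^script/([0-9]+)/([^/]+)/?$ script.php?id=$1&smth=$2 [L,QSA]
```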
I use this in .htaccess (index.php is handling the pages):
RewriteRule ^home/$ index.php?page=home
# Result: http://domain.com/home/
If you want it more dynamic:
RewriteRule ^(.*)/$ index.php?page=$1
# Result: http://domain.com/any-page/
For more data you might need when requesting a page:
RewriteRule ^(.*)/(.*)/$ index.php?page=$1&id=$2
# Result: http://domain.com/data-page/15/
If you want to just get the called page without the index part:
RewriteRule ^(.*)/$ $1.php
To remove extensions the user adds:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteCond %{REQUEST_FILENAME}\.html -f
You need to do 2 things. First you need to change all of your links to the URLs without extensions, then you need to create a 301 redirect to redirect browsers and bots that may still have the old URLs with extensions to the new, nicer-looking ones. This is a server response announcing a new location; it's not a rewrite (which is an internal, on-the-server-only URI rewrite).
RewriteEngine On
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ /(.*)\.(html?|php)
RewriteRule ^ /%2/ [L,R=301]
This makes it so when someone types http://www.website.com/page.html in their browser, they get redirected to http://www.website.com/page/ and the new URL will appear in the browser's URL address bar.
The second thing you need to do is internally map it back to the valid resource. Since /page/ doesn't exist, the request would otherwise return a 404.
# need these so we don't clobber legit requests, just pass them through
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
# now check the extensions
RewriteCond %{REQUEST_URI} ^/(.*?)/?$
RewriteCond %{DOCUMENT_ROOT}/%1.php -f
RewriteRule ^ /%1.php [L]
RewriteCond %{REQUEST_URI} ^/(.*?)/?$
RewriteCond %{DOCUMENT_ROOT}/%1.html -f
RewriteRule ^ /%1.html [L]
RewriteCond %{REQUEST_URI} ^/(.*?)/?$
RewriteCond %{DOCUMENT_ROOT}/%1.htm -f
RewriteRule ^ /%1.htm [L]
You'll just need to put these in the htaccess file in your document root.

How to force http- NOT https using htaccess

I have ONE directory for my entire domain that I want to force https, which is "/docs". In the /docs folder, I have the following htaccess file:
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
This is forcing https to everything in the /docs directory, which is what I want it to do. The problem I am having is trying to force REMOVE https back to http for all other areas of my site. In the root folder of the site (which is running wordpress), I have the following htaccess file:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
RewriteCond %{HTTPS} on
RewriteCond %{REQUEST_URI} !^/docs/?.*$
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
Unfortunately, this is not working. I can still access other areas of my site over https.
What do I need to change to get this to work correctly?
Since the accepted answer doesn't actually answer the question, I figured I'd post my solution to this. Add this to your .htaccess file to force HTTP instead of HTTPS:
# BEGIN Force HTTP
RewriteEngine On
RewriteCond %{SERVER_PORT} 443
RewriteRule ^(.*)$ http://yourdomain.com/$1 [R=301,L]
# END Force HTTP
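Rule order is one likely reason the rules in the question have no effect: the WordPress block ends with a catch-all carrying the [L] flag, so for most requests the HTTPS check placed after it is never reached. A sketch of the same idea placed first, assuming the site is served from the document root and /docs is the only path that should stay on HTTPS:

```apache
# Must come BEFORE the "# BEGIN WordPress" block.
RewriteEngine On

# Only act on HTTPS requests that are not under /docs.
RewriteCond %{HTTPS} on
RewriteCond %{REQUEST_URI} !^/docs(/|$)
RewriteRule ^ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# ... existing WordPress block follows here, unchanged ...
```

If WordPress itself is configured with an https:// site URL it may redirect back again, so that setting is worth checking too.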
Try the Force non-SSL plugin for wordpress.
The "WordPress Force HTTP" plugin was the only thing that worked for me. It changes https to http for not just the front page like most of the answers out there, but also changes https to http for all sub-directories in your website.
https://en-au.wordpress.org/plugins/wp-force-http/
Why do you need to revert back to http? If you have the proper SSL certificates you might as well keep your access secure. Unless you are concerned about the load on your system.
I know this does not answer the question, but I want to emphasize that the question asks how to implement a bad practice, which shouldn't be done in the first place.

.htaccess rewriterule /state/city/

This will take a bit of explanation so I hope I don't lose everyone here.
I needed to get something like the following:
http://example.com/results.html?state=iowa&city=davenport
turned into:
http://example.com/iowa/davenport/
I was able to accomplish this with the use of these two rewriterules:
RewriteRule ^([A-Za-z0-9-]+)/?$ cities.html?state=$1
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/?$ results.html?state=$1&city=$2
The problem is that in the backend there is "some code somewhere" that is getting broken as a result of the second rewriterule. It has to do with filling in a select box based on the results of another one selected (I don't think that matters though). I think the problem is that my /state/city pattern matches too broadly.
Here is a copy of my full (modified for security) .htaccess file:
IndexIgnore *
AddHandler application/x-httpd-php5 .html .htm
RewriteRule ^([A-Za-z0-9-]+)/?$ cities.html?state=$1
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/?$ results.html?state=$1&city=$2
<Files .htaccess>
order allow,deny
deny from all
</Files>
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
<IfModule mod_suphp.c>
suPHP_ConfigPath /home/USER
<Files php.ini>
order allow,deny
deny from all
</Files>
The code that it's screwing up is very complex and it's someone else's code. After a couple of hours I've been unable to wade through all of their stuff to even come close to what I may be able to change on their end to get things working.
Does anyone have ANY ideas on what I could do to avoid this problem? I really only have 3 .html files that I'm funneling my frontend code through, so I had tried something like wrapping my rewriterules in a filesMatch block, and the same with using just "files" instead of filesMatch. Everything I've come up with breaks something else or the entire site in one way or another.
First: (i) HostGator won't enable or give you access to rewrite logs; (ii) your suPHP config has syntax errors, and HostGator almost certainly does some of this and the .htaccess / php.ini denials in its own root / vhost configs. However, I'll focus on the mod_rewrite elements:
RewriteEngine On
RewriteRule ^([A-Za-z0-9-]+)/?$ cities.html?state=$1
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/?$ results.html?state=$1&city=$2
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
I am also assuming that you don't have any .htaccess files in subdirectories with the rewrite engine enabled as these could preempt this under rewrite "Per Directory" precedence rules.
Rule (3) is a simple domain redirector. Rule (4) is draconian: it rewrites any URI which is not an existing file or directory to index.php in the current directory, leaving the query string intact.
Rules (1) and (2) are your new rules. As Mike says, you should include the [L] flag, but since the files cities.html and results.html exist, rule (4) won't match the rewritten request anyway.
I am curious as to why the trailing slash in the URIs is optional. Better to decide one way and fix it.
The issue is that the match criteria for (1) and (2) are too broad and are picking up URIs intended for the general catch-all (4). You need to lock this down to make these mutually exclusive. One way is to mine your access logs (which are available with HostGator) to find the standard URIs which the application expects and check that none match (1) or (2). However, since most will include a ".", this probably isn't the case. But check.
The other issue is whether the existing scripts use absolute or relative references, e.g. <img src="images/myimage.png">, in any output HTML. Here the browser has asked for, say, http://www.example.com/texas/houston and will therefore look for http://www.example.com/texas/images/myimage.png, which doesn't match (1), (2) or (3) and therefore is caught by (4) and passed to /index.php. Ditto CSS files etc. Hence they won't 404, and index.php will get confused and send some default response which will hopelessly confuse the browser.
However, again, analysis of the access logs (in this case for URIs with a referrer of http://www.example.com/texas/houston) will show you if this is going on.
If your app uses standard subdirectories then you might be able to fix this by a rule (3.1) which looks something like
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond $1/$2 -f
RewriteRule .*?(images|css|styles)/(.+) $1/$2 [L]
though the details will depend on the rest of your application.
I was able to solve it by changing my (relevant) .htaccess entries to the following:
RewriteEngine On
RewriteRule ^([A-Za-z0-9-]+)/?$ cities.html?state=$1
RewriteCond %{REQUEST_URI} !^/signup/
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/?$ results.html?state=$1&city=$2
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
The addition being:
RewriteCond %{REQUEST_URI} !^/signup/
HostGator was able to find that the issue was /signup by looking in a log somewhere. I never did find out which log they were able to look at, but I assume it was something I didn't have access to.
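If more application paths turn out to collide with the /state/city pattern, the same exclusion can be widened to a list of reserved prefixes. A sketch only: signup comes from the fix above, while login and admin are hypothetical placeholders for whatever the access logs reveal.

```apache
RewriteEngine On

RewriteRule ^([A-Za-z0-9-]+)/?$ cities.html?state=$1 [L]

# Do not treat application paths as /state/city.
# "login" and "admin" are hypothetical examples, not known paths of this site.
RewriteCond %{REQUEST_URI} !^/(signup|login|admin)/
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/?$ results.html?state=$1&city=$2 [L]
```

A RewriteCond applies only to the RewriteRule that immediately follows it, so the one-segment cities rule above is not affected by the exclusion.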