Regex to avoid "dirty" web links - html

If this is my actual URL to a file:
http://www.example.org/posts.php?post=example-post-name
In my .htaccess file, how can I use a regular expression to get to this path when a user submits:
http://www.example.org/posts/example-post-name
So far I've come up with this bringing together a few examples (this also included a www redirect):
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.*) [NC]
RewriteRule ^(.*) http://%1/$1 [R=301,L]
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) /$1.php [L]
RewriteRule ^posts/([A-Za-z])/$ /posts.php?post=$1
But I'm not having much luck with it, can anyone tell me where I'm going wrong?

You need a + after your A-Za-z character class so that it matches one or more characters, and you also need to add a - at the end of that class so that hyphenated names such as example-post-name match. At the end, the /? indicates that the final slash may or may not be present.
Finally, add [L] to be sure no further rewrite rules get processed.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.*) [NC]
RewriteRule ^(.*) http://%1/$1 [R=301,L]
# First rewrite the posts:
RewriteRule ^posts/([A-Za-z-]+)/?$ /posts.php?post=$1 [L]
# ing0 edit: add in dirs (css, img, ...) that need changing back to their real paths.
# (I don't know if there is an easier way to do this.)
RewriteRule ^posts/css/(.*)$ /css/$1 [L]
RewriteRule ^posts/img/(.*)$ /img/$1 [L]
# etc
# Then, if it's not a real file and doesn't already end in .php
# Note change here ...
RewriteCond %{REQUEST_URI} !\.php$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# rewrite it to the matching .php file.
RewriteRule (.*) /$1.php [L]
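To see why the original pattern never matched, here is a minimal sketch in Python. It assumes Python's re module behaves like Apache's regex engine for these simple patterns, and rewrite_post is a hypothetical helper name, not part of mod_rewrite:

```python
import re

# Pattern from the question: exactly one letter, mandatory trailing slash.
ORIGINAL = re.compile(r"^posts/([A-Za-z])/$")
# Pattern from the answer: one or more letters or hyphens, optional trailing slash.
FIXED = re.compile(r"^posts/([A-Za-z-]+)/?$")


def rewrite_post(path):
    """Return the rewritten target for a path, or None if the rule does not match."""
    match = FIXED.match(path)
    if match is None:
        return None
    return f"/posts.php?post={match.group(1)}"


if __name__ == "__main__":
    for path in ("posts/example-post-name", "posts/example-post-name/",
                 "posts/css/style.css", "posts/post-2"):
        print(path, "->", rewrite_post(path))
```

Note that the character class still excludes digits and underscores, so a slug like post-2 would not match; widening the class (for example to [\w-], as another answer suggests) would cover that.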

I think you need to match the whole url and the regex wasn't quite right. Try this:
RewriteRule ^(.*)/posts/([\w-]+)$ $1/posts.php?post=$2
If it only works when matching the non-base part of the URL, try this:
RewriteRule ^posts/([\w-]+)$ posts.php?post=$1

Why not use:
RewriteRule ^posts/(.*)/$ posts.php?post=$1

Related

RewriteEngine config

My website is coded in such a way that forms, inputs, etc. do not send things such as index.php?subtopic=register&step=1 or index.php?subtopic=account&page=login; instead they send just /account or /register. I'm an amateur when it comes to the web, but I searched a little and started writing a .htaccess file like this:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /index.php?subtopic=$1 [L,QSA]
RewriteRule ^home/([^/]+)/?$ /index.php?subtopic=home [L,QSA]
RewriteRule ^characters/([^/]+)/?$ /index.php?subtopic=characters&name=$1 [L,QSA]
RewriteRule ^register/([^/]+)/?$ /index.php?subtopic=register&step=$1 [L,QSA]
It does seem to work, but I soon realized that there are too many variables to fill in, and it does not seem right. Sorry for asking; I think it should be very simple, but I can't figure it out.
From the comments on the question it becomes clear that you are asking for a more flexible way to handle the different tokens before the first slash ("characters" or "register").
You certainly can grab those tokens by a pattern too in a flexible manner:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /index.php?subtopic=$1 [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/?$ /index.php?subtopic=$1&name=$2 [L,QSA]
Certainly you will have to make sure that this strategy actually is correct for all requests matching that pattern. So you might have to add some exception to that general rule...
This is a slightly modified version that I would recommend. In the real host configuration the pattern is matched against a path with a leading slash, while in .htaccess files that slash is stripped. Using an optional leading slash (^/? instead of just ^) lets the same rule set work in the real host configuration, which is always preferable to using .htaccess-style files:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?([^/]+)/?$ /index.php?subtopic=$1 [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?([^/]+)/([^/]+)/?$ /index.php?subtopic=$1&name=$2 [L,QSA]
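A rough way to check what these two generic rules capture is to replay the patterns in Python. This is only a sketch: it assumes Python's re module matches these patterns the same way Apache does, it ignores the !-f / !-d conditions and the QSA flag, and rewrite is a hypothetical helper name:

```python
import re

# The two patterns from the rule set above, with the optional leading slash.
ONE_SEGMENT = re.compile(r"^/?([^/]+)/?$")
TWO_SEGMENTS = re.compile(r"^/?([^/]+)/([^/]+)/?$")


def rewrite(path):
    """Return the target the rules would produce, or None if neither rule matches."""
    match = ONE_SEGMENT.match(path)
    if match:
        return f"/index.php?subtopic={match.group(1)}"
    match = TWO_SEGMENTS.match(path)
    if match:
        return f"/index.php?subtopic={match.group(1)}&name={match.group(2)}"
    return None


if __name__ == "__main__":
    for path in ("register", "/characters/Bob/", "register/1", "a/b/c"):
        print(path, "->", rewrite(path))
```

The register/1 case shows the kind of exception mentioned above: the second segment always ends up in name, so a request that needs step=1 would still need its own rule placed before the generic one.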

How to set up different URLs for the Yii frontend and backend

For example, if I want localhost/basic.com for the front-end and localhost/basic/admin for the back-end.
I am new to the framework, however, it seems really interesting.
Please help.
I have watched tutorials and all but none seem to cover this.
You can use the following .htaccess:
Options FollowSymLinks
AddDefaultCharset utf-8
<IfModule mod_rewrite.c>
RewriteEngine On
# the main rewrite rule for the frontend application
RewriteCond %{REQUEST_URI} !^/(backend/web|admin)
RewriteRule !^frontend/web /frontend/web%{REQUEST_URI} [L]
# redirect to the page without a trailing slash (uncomment if necessary)
#RewriteCond %{REQUEST_URI} ^/admin/$
#RewriteRule ^(admin)/ /$1 [L,R=301]
# the main rewrite rule for the backend application
RewriteCond %{REQUEST_URI} ^/admin
RewriteRule ^admin(.*) /backend/web/$1 [L]
# if a directory or a file of the frontend application exists, use the request directly
RewriteCond %{REQUEST_URI} ^/frontend/web
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# otherwise forward the request to index.php
RewriteRule . /frontend/web/index.php [L]
# if a directory or a file of the backend application exists, use the request directly
RewriteCond %{REQUEST_URI} ^/backend/web
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# otherwise forward the request to index.php
RewriteRule . /backend/web/index.php [L]
RewriteCond %{REQUEST_URI} \.(htaccess|htpasswd|svn|git)
RewriteRule \.(htaccess|htpasswd|svn|git) - [F]
</IfModule>
The full configuration can be found at: https://github.com/mickgeek/yii2-advanced-one-domain-config

Need to remove .html file extension and duplicate names

I have a .htaccess file with the contents below, that removes the .html file extension for all of my website's pages.
Options +MultiViews
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]
RewriteEngine On
RewriteCond %{SERVER_PORT} !=443
RewriteCond %{HTTP_HOST} ^(www\.)?james-lee\.io$ [NC]
RewriteRule ^$ https://www.james-lee.io%{REQUEST_URI} [R,L]
My links now look like www.james-lee.io/resume/resume, whereas before they looked like www.james-lee.io/resume/resume.html. I would like to remove the folder name, so that the folder name is not duplicated by the file name minus the .html, and the final result looks like www.james-lee.io/resume.
I have seen similar questions but not exactly what I am looking for.
So I have done this task!
Try this code:
RewriteCond %{REQUEST_URI} ^/(.*)/(.*)$
RewriteCond %{DOCUMENT_ROOT}/%1 -d
RewriteCond %{DOCUMENT_ROOT}/%1/%2 -f
RewriteCond %1::%2 ^(.*)::\1$
RewriteRule ^(.*)$ /%1 [R,L]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*)/(.*)$ /$1/$1 [END]
Let me explain these rules.
first line: the request looks like /folder/file
second line: check that /folder is a real existing directory
third line: check that /folder/file is a real existing file
fourth line: I use the notation %1::%2 because %N backreferences can only be used in the left part (the test string) of a RewriteCond. It is possible, though, to refer back to part of the test string from within the pattern on the right. So in "^(.*)::\1$" the group (.*) captures everything before ::, and \1 requires the text after :: to be identical to it. The condition therefore matches only if the folder name equals the file name.
Next I just redirect to the result (/folder or /file, it doesn't matter, because both are equal).
But if folder == file, the redirect will always land on the directory.
So the last rule checks whether the redirect target is an existing directory and internally rewrites it to the file inside it.
Request example:
http://yourdomain/test/test
(this will be redirected to http://yourdomain/test, but will still serve the content of the original link)
I hope I have explained this clearly, but if you have any questions, I would be glad to answer.
Thank you for the interesting task!
P.S. see also %N backreference inside RewriteCond
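The %1::%2 comparison trick can be tried outside Apache. The following is a small sketch in Python, assuming its re module treats the \1 backreference the same way here; folder_equals_file is a hypothetical helper name:

```python
import re

# Same pattern as in the fourth RewriteCond: capture everything before "::",
# then require the text after "::" to be identical to that capture.
SAME_ON_BOTH_SIDES = re.compile(r"^(.*)::\1$")


def folder_equals_file(folder, file):
    """Mimic the test string %1::%2 being checked against ^(.*)::\\1$."""
    test_string = f"{folder}::{file}"
    return SAME_ON_BOTH_SIDES.match(test_string) is not None


if __name__ == "__main__":
    print(folder_equals_file("test", "test"))           # True
    print(folder_equals_file("resume", "cv"))           # False
    print(folder_equals_file("resume", "resume.html"))  # False
```

The last call illustrates why the comparison is done on the name without the .html extension.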
UPDATED. Your .htaccess has to look like this:
RewriteEngine on
RewriteCond %{SERVER_PORT} !=443
RewriteCond %{HTTP_HOST} ^(www\.)?james-lee\.io$ [NC]
RewriteRule ^$ https://www.james-lee.io%{REQUEST_URI} [R,L]
RewriteCond %{REQUEST_URI} ^/(.*)/(.*)$
RewriteCond %{DOCUMENT_ROOT}/%1 -d
RewriteCond %{DOCUMENT_ROOT}/%1/%2\.html -f
RewriteCond %1::%2 ^(.*)::\1$
RewriteRule ^(.*)$ /%1 [R,L]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*)/(.*)$ /$1/$1.html [END]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]

Remove file extension with .htaccess: Error with trailing slash

I want to remove file extensions like .html from my website's URLs with .htaccess. The final structure should be like so:
http://domain.com/file --> http://domain.com/file.html
http://domain.com/file/ --> http://domain.com/file.html
With my existing code in .htaccess I get an "Internal Server Error" in my browser when there's a trailing slash at the end. What can I do? Thanks!
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
RewriteEngine On
RewriteBase /
RewriteRule ^([a-zA-Z0-9-_]+)/?$ $1.html [L]
I suggest you change your RewriteCond:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)(\.html){0}$ /$1.html [L]
EDIT: rule edited; I had forgotten about the infinite loop.
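To get a feel for what this pattern captures, it can be replayed in Python. This is only a sketch under assumptions: Python's re module is taken to treat the {0} quantifier like Apache's regex engine does, the !-d / !-f conditions are ignored, and both helper names and the TRAILING_SLASH_VARIANT pattern are hypothetical and not part of the answer:

```python
import re

# Pattern from the answer. A group quantified with {0} matches zero times,
# so (.*) is free to capture the whole path, extension included.
ANSWER_PATTERN = re.compile(r"^(.*)(\.html){0}$")

# Hypothetical variant (an assumption, not from the answer): capture lazily and
# leave an optional trailing slash outside the group.
TRAILING_SLASH_VARIANT = re.compile(r"^(.+?)/?$")


def rewrite_with(pattern, path):
    """Return /<capture>.html for a matching path, or None otherwise."""
    match = pattern.match(path)
    return f"/{match.group(1)}.html" if match else None


if __name__ == "__main__":
    for path in ("file", "file/", "file.html"):
        print(path, "->", rewrite_with(ANSWER_PATTERN, path),
              "|", rewrite_with(TRAILING_SLASH_VARIANT, path))
```

In this sketch the {0} group does not keep file.html from matching, so it is presumably the !-f condition that prevents the loop for existing files. A trailing slash still ends up inside the capture with the answer's pattern, while the hypothetical variant drops it.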

Exclude certain subfolders and domains in redirects

This is a continuation from Redirect only HTML files?
How can I change my .htaccess to make it exclude certain subfolders or subdomains from the HTML-only redirect? I tried using this code to exclude the 'downloads' subfolder and the 'dev' and 'support' subdomains, but it didn't work:
RewriteCond %{HTTP_HOST} ^pandamonia.us$ [OR]
RewriteCond %{HTTP_HOST} ^www.pandamonia.us$ [OR]
RewriteCond %{HTTP_HOST} !download [OR]
RewriteCond %{HTTP_HOST} !faq
RewriteCond %{HTTP_HOST} !support [OR]
RewriteRule /.+\.html$ "http\:\/\/pandamonia\.us\/" [L]
You need to check REQUEST_URI or the whole match of the RewriteRule ($0) for this; HTTP_HOST contains only the host name of the current request. You also need to change the logical expression of your conditions, since conditions are ANDed unless they carry the [OR] flag:
RewriteCond %{HTTP_HOST} ^pandamonia\.us$ [OR]
RewriteCond %{HTTP_HOST} ^www\.pandamonia\.us$
RewriteCond %{REQUEST_URI} !^/download/
RewriteCond %{REQUEST_URI} !^/faq/
RewriteCond %{REQUEST_URI} !^/support/
RewriteRule /.+\.html$ http://pandamonia.us/ [L]
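The way these conditions combine can be written out as a boolean expression. Here is a minimal sketch in Python that models only the RewriteCond chain (not the RewriteRule pattern itself); conditions_pass is a hypothetical helper name, and the dot after www is escaped in this sketch:

```python
import re


def conditions_pass(host, uri):
    """Model the condition chain as (A or B) and C and D and E."""
    # A condition flagged [OR] is ORed with the one that follows it;
    # conditions without a flag are ANDed together.
    host_matches = bool(
        re.search(r"^pandamonia\.us$", host)
        or re.search(r"^www\.pandamonia\.us$", host)
    )
    # Each negated condition (!pattern) must hold, so no excluded prefix may match.
    excluded_prefixes = (r"^/download/", r"^/faq/", r"^/support/")
    uri_allowed = not any(re.search(prefix, uri) for prefix in excluded_prefixes)
    return host_matches and uri_allowed


if __name__ == "__main__":
    print(conditions_pass("pandamonia.us", "/docs/page.html"))          # True
    print(conditions_pass("www.pandamonia.us", "/download/tool.html"))  # False
    print(conditions_pass("dev.pandamonia.us", "/docs/page.html"))      # False
```

Subdomains such as dev are excluded simply because the two host conditions only accept the bare domain and the www host.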
For those looking for a quick bit of insight into Gumbo's previous reply (where he mentions when to use, and when not to use, [OR]), I found this WMW thread very helpful: http://www.webmasterworld.com/apache/3522649.htm