Apache2 configuration for microsites

Please, I need help. I'm trying to configure microsites with Apache.
I have a main site "mainsite" (www.mainsite.aaa) with document root /a/b/mainsite, abbreviated below as ./mainsite.
I want to serve some content, "minisiteA", "minisiteB" and so on, each under its own domain, "www.minisiteA.aaa", "www.minisiteB.aaa", ... (there are about 80 minisites). Most of their content, but not all of it, lives under ./mainsite/microsite/minisiteA, ./mainsite/microsite/minisiteB, and so on. There are many references to scripts, CSS, images, etc. under ./mainsite/ccs, ./mainsite/imgages, ... that is, outside the pseudo document root of minisiteX. (That is my main problem at the moment.)
I tried to solve this by creating a virtual host that handles all the microsites (via ServerAlias) with the same document root as the main site (I don't want to modify the mainsite vhost), but I haven't found a solution. I guess it's a very common configuration, but I can't find it (I suspect I'm not using the right keywords).
I have tried several configurations; here are two of them. The first is the simplest, but it doesn't work with files below ./microsite/minisiteX, and the second has the same problem. I may be far from the correct solution, so tips and advice are welcome.
First config:
RewriteCond %{HTTP_HOST} ^www\.minisiteA\.com:8036$
RewriteRule ^/(.+) /microsite/minisiteA/$1 [L]
Second config:
RewriteCond %{ENV:REDIRECT_SUBDOMAIN} =""
RewriteCond %{HTTP_HOST} ^([a-z0-9][-a-z0-9]+)\.mydomain\.org\.?(:80)?$ [NC]
RewriteCond %{DOCUMENT_ROOT}/subdomains/%1 -d
RewriteRule ^(.*) subdomains/%1/$1 [E=SUBDOMAIN:%1,L]
RewriteRule ^ - [E=SUBDOMAIN:%{ENV:REDIRECT_SUBDOMAIN},L]

I think 'dynamically configured mass virtual hosting' is what you want here.
Given that the sites follow the same structure you can have Apache map your domain names to your local structure, i.e. by mapping the domain names to folder names using a variable - 'minisiteA' can be mapped to /local/path/minisiteA, 'minisiteB' can be mapped to /local/path/minisiteB ...
http://httpd.apache.org/docs/2.2/mod/mod_vhost_alias.html
This has saved me before: I wrote one config rather than scripting the generation of hundreds and hundreds of small, almost identical configs.
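As a rough, untested sketch of that idea applied to the layout in the question: a single virtual host could derive each minisite's document root from the second label of the requested hostname, and keep the shared assets reachable through Alias. The hostnames and base paths come from the question; the port, the ServerName placeholder and the /css and /images URL prefixes are assumptions and need adjusting to the real names.

```apache
# One catch-all vhost for every minisite; the mainsite vhost stays untouched.
# Requires mod_vhost_alias and mod_alias to be loaded.
<VirtualHost *:80>
    # Placeholder name (assumption); the ServerAlias wildcard does the matching.
    ServerName minisites.mainsite.aaa
    ServerAlias www.*.aaa

    # Build paths from the Host header rather than from ServerName.
    UseCanonicalName Off

    # %2 is the second dot-separated part of the hostname:
    # www.minisiteA.aaa -> /a/b/mainsite/microsite/minisiteA
    VirtualDocumentRoot /a/b/mainsite/microsite/%2

    # Shared assets that live outside the per-site root (assumed URL prefixes).
    Alias /css    /a/b/mainsite/css
    Alias /images /a/b/mainsite/images
</VirtualHost>
```

mod_alias mappings take precedence over mod_vhost_alias, so the Alias lines keep working for every minisite. Two things to check before relying on this: the mainsite vhost probably has to be listed first so that www.mainsite.aaa is matched there rather than by the wildcard, and because hostnames are case-insensitive the directory names on disk may need to be lowercase. Access control for the directories is left out of the sketch.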

Use htaccess to fix misspelled urls

So I have a pretty simple problem (at least I think so) with my website. I need to be able to redirect any misspelled URLs to the correct ones. It's easier to show with an example than to describe.
For example, let's take this url.
http://www.tomshardware.com/reviews/radeon-r9-290x-hawaii-review,3650.html
Now, that url will take you to the correct page of that article regardless of how the url is spelled. Say you accidentally place a letter, number or a word into that URL to something like this:
http://www.tomshardware.com/reviews/radeon-r9-290x-TEST-TEST-hawaii-review,3650.html
That url will still take you to the correct article and fix itself to the correct URL. You could add anything to that URL and it will still take you to the right article regardless what you accidentally type into it.
So my question is how do I do this in htaccess? This is my current htaccess file
# Secure htaccess file
<files .htaccess>
order allow,deny
deny from all
</files>
AddHandler application/x-httpd-php5 .html .htm
AddType application/x-httpd-php .html .htm .php
AddHandler cgi-script .pl .cgi
Options ALL -Indexes -Multiviews +ExecCGI +FollowSymLinks
# Do not remove this line, otherwise mod_rewrite rules will stop working
RewriteBase /
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
#Redirect Non-WWW to WWW
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "/$1" [NC,R=301,NE,L]
You probably can't do it that way.
As you can observe, the text in the URL is totally irrelevant and is only there to create readable and index-friendly (SEO) URLs. Those words are called "slugs", see http://en.wikipedia.org/wiki/Clean_URL#Slug
If you modify the last part, the 3650, it will break the URL, because this is the only real identifier; it typically corresponds to a unique ID in the database.
My assumption about how and why the site in question does this:
The site uses either a standalone routing component (e.g. Routing from the Symfony PHP framework: http://symfony.com/components/Routing), an entire web framework, or everything is written by hand. Depending on the language it might be Zend, Symfony, etc. for PHP, ASP.NET MVC for .NET, or any other.
In all cases there is some sort of filtering of urls before the original content is served.
The routing parses the url, retrieves the unique ID, fetches the data set and creates again an absolute URL out of it.
It then compares the freshly generated route with the one you have entered.
If they don't match, the framework issues an HTTP status of 30x and redirects you to the new URL.
The purpose of that is to maintain link sanity when the slug tags have changed or when, for whatever reason, the SEO-friendly URL layout has changed.
The redirect is there so that old URLs are updated the next time a search engine visits the page and updates its index.
Imagine you have a typo somewhere in the slugs or you forgot to mention Radeon and you want to avoid having it forever broken or wrong in the DB.
So you need to fix it but at the same time you want to avoid breaking the old urls for search indexes which have not yet revisited your site with the new slugs or users that have bookmarked it.
After the redirect it again compares the urls and after they match the content is served.
A DB lookup is very likely here and you cannot do this properly with htaccess alone as you have no knowledge about correctness of the url here.
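If the ID-to-slug pairs can be exported from the database into a text file, part of this comparison can be pushed into Apache with RewriteMap. This is an untested sketch under several assumptions: it runs in server or virtual-host configuration, because RewriteMap is not allowed in .htaccess; the map file path is hypothetical; and the URL layout is the /reviews/slug,ID.html form from the example above.

```apache
# Server or virtual-host context (RewriteMap is not allowed in .htaccess).
RewriteEngine On

# Hypothetical map file exported from the database, one "id slug" pair per line:
#   3650 radeon-r9-290x-hawaii-review
RewriteMap slugs "txt:/etc/apache2/article-slugs.txt"

# Only act on IDs that exist in the map (unknown keys expand to an empty string).
RewriteCond ${slugs:$2} !=""
# Skip the redirect when the requested slug already equals the canonical one.
# A CondPattern is not variable-expanded, so both values are joined with "#"
# in the test string and compared through the \1 backreference.
RewriteCond $1#${slugs:$2} !^([^#]*)#\1$
# $1 = slug as typed by the visitor, $2 = numeric ID
RewriteRule ^/reviews/([^/,]*),([0-9]+)\.html$ /reviews/${slugs:$2},$2.html [R=301,L]
```

The map has to be regenerated whenever a slug changes, which is why doing the lookup inside the application is usually the simpler route.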
You would internally rewrite all article pages to a PHP program, which matches the parameters against the best possible page to show.
-- .htaccess --
RewriteEngine on
# Internally rewrite every article URL to a single PHP script
RewriteRule ^article/(.*)\.html$ /article.php?url=$1 [L]
-- php --
<?php
// Read the article selection criteria passed in by the rewrite rule
$article_url = isset($_GET['url']) ? $_GET['url'] : '';
// Search through the database or files for the best match and show the article

URL rewrite in html

I want to do URL rewriting for HTML pages.
Can anyone help with that? URL rewriting is well known, but all the articles I found cover PHP, ASP.NET or classic ASP. Does anyone know how to do URL rewriting for plain HTML pages?
For example, I want to rewrite
http://www.xyz.com/aboutus.html to http://www.xyz.com/About-us
Any help will be appreciated.
Thank you.
I don't normally work much with RewriteEngine or .htaccess, but according to this blog entry, you can use the following code to hide file extensions:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Just paste that into your .htaccess file and upload it to the server (using an FTP client, for example). If your .htaccess already contains RewriteEngine on, skip the first line.
If you want to change the URL from xyz.com/aboutus.html to xyz.com/About-Us, you also have to change the name of the file or folder aboutus to About-Us. Another possible solution would involve just having an index.html file in a folder named About-Us, which would make the server load that file automatically once a user accesses xyz.com/About-Us, and wouldn't display the filename.
To rewrite URLs you need to configure the web server to do so (for example, in Apache with .htaccess).
If you want to do it without any server configuration, a poor workaround exists: create a folder named after the URL and put the HTML file inside it as index.html. For example, about-us.html becomes about-us/index.html, and your links point to about-us. It is a bad solution, though.
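For the specific mapping asked about, a minimal .htaccess sketch could look like the following. It assumes Apache with mod_rewrite enabled and that aboutus.html sits in the document root; it has not been tested against the asker's host.

```apache
# .htaccess in the document root (requires mod_rewrite)
RewriteEngine On

# Send visitors who explicitly request the old address to the new one.
# THE_REQUEST holds the original request line, so the internal rewrite
# below cannot trigger this rule again and cause a redirect loop.
RewriteCond %{THE_REQUEST} \s/aboutus\.html[\s?]
RewriteRule ^aboutus\.html$ /About-us [R=301,L]

# Serve the existing file when the pretty URL is requested.
RewriteRule ^About-us$ aboutus.html [L]
```

The file keeps its name on disk; only the address shown in the browser changes. Links inside the pages should be updated to point to /About-us so that visitors do not go through the redirect every time.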

Is an index file for every page the wrong way to set up a site?

My goal was to prevent the user from having to type in .html in order to access the page they are looking for on our site. On other sites I have left the file name as /pagename.html and the user could type in only /pagename and the page would load. For some reason, that was not possible with our server settings (GoDaddy Plesk Parallel server) so my workaround was to create a folder for every page I wanted and the actual file would be /index.html. My goal was accomplished and now the user doesn't have to include .html to load the page. The problem now is that Google and SEOmoz reports are reading tons of duplicate content. The reason is that the user could type in 3 different things to get to the same page - technically 6 if you include "www":
sitename.com/services
sitename.com/services/
sitename.com/services/index.html
Search engines are displaying it the 2nd way (http://sitename.com/services/) and if you type it without the "/" it redirects to showing it with the "/". SEOmoz is saying I have 301 redirects for each page in order for that to happen but we never manually did that.
I've tried creating an .htaccess file with redirects from sitename.com/services/ to sitename.com/services but the page won't load because of too many redirects.
Did I break some big rules setting it up this way?
Please note that "sitename.com/services/" is just an example of a page and our entire site of 50 pages is set up in this nature. The actual site is http://www.logicalposition.com.
The preferred way is to set up your server to manage the URL handling. If you are on an Apache server, for example, you could use the following suggestion and create or change the .htaccess file to get the desired effect.
http://eisabainyo.net/weblog/2007/08/19/removing-file-extension-via-htaccess/
The most straightforward way is to use Apache's .htaccess (which if I remember correctly GoDaddy allows access to, though I may be wrong) to do redirects.
See this post: https://stackoverflow.com/a/5730126/549346 (mods: possible duplicate?), which directs you to place something like the following in your .htaccess file:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.html$ /$1 [L,R=301]
First, it sounds like you haven't done the basic legwork to minimize this. You need to decide whether you want www.samplesite.com or just samplesite.com. You can then set this very easily with .htaccess (see this handy tool). This means you will have at most three variations, not six.
I would take #Jassons's suggestion and use URL handling. Two of my clients currently use GoDaddy and both use this method, so it should be fully supported.
Some more helpful links for URL handling and .htaccess rewrites (note, though, that setting up 301 redirects takes time, patience and careful monitoring of crawl errors in Webmaster Tools, so URL handling is preferable!):
http://net.tutsplus.com/tutorials/other/using-htaccess-files-for-pretty-urls/
Extreme example, but still relevant :) Handling several thousand redirects with .htaccess
Edit: Forcing trailing slash
You can easily force the trailing slash to appear with the following rewrite rules:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*) $1 [L]
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [L,R=301]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?category=$1
I think you have already done that in part, but you will notice that a 301 redirect header is sent. That means that as spiders visit your site they will update the URL to include the trailing slash, though it won't happen overnight. You might be able to use Webmaster Tools to speed up the URL change.
Source: in part this website, which gives a good explanation of how it works.
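To collapse the remaining variants onto one address per page, the index.html form can be redirected to the directory URL as well. A minimal sketch for an .htaccess in the site root, assuming the www host and the trailing-slash form are the ones to keep (untested against the site in question):

```apache
RewriteEngine On

# sitename.com/... -> www.sitename.com/...
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

# /services/index.html -> /services/
# Matching against THE_REQUEST limits the rule to what the client asked for,
# so the internal lookup of index.html for /services/ is not redirected.
RewriteCond %{THE_REQUEST} \s/+(?:[^\s?]*/)?index\.html[\s?] [NC]
RewriteRule ^(.*/)?index\.html$ /$1 [R=301,L]
```

The 301 from /services to /services/ that SEOmoz reports comes from Apache itself: mod_dir adds the trailing slash for directories (DirectorySlash is on by default). That is also why a hand-written redirect from /services/ back to /services ends in "too many redirects"; the two redirects keep undoing each other.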

Editing into multiple HTML Pages

I have an HTML website that contains 330+ HTML files, and every time I want to edit some code or tags across all of them, I get stuck.
I searched the net thoroughly, but couldn't find a single free tool that could do this. I used Notepad++, but it only replaces the text with a single tag.
Something like http://www.wingrep.com/ can find and replace across multiple files, if you want to change code blocks en masse.
Something to look at for the future would be using php to import your header, footer and sidebar from separate files. That way you only need to change one file.
<?php include 'header.php';?>
<?php include 'footer.php';?>
<?php include 'sidebar.php';?>
Not much use to you now, but maybe the most useful thing I've come across lately :0)
To prevent remote file inclusion (RFI) attacks (I came across this in an article somewhere):
Here we check our query string for http://, https:// or ftp://
RewriteCond %{QUERY_STRING} (.*)(http|https|ftp):\/\/(.*)
If you are using this rewrite within a .htaccess all you have left is to deny access from all matching requests.
RewriteRule ^(.+)$ - [F]
If you have access to your vhost you could also log those requests like this:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{QUERY_STRING} (.*)(http|https|ftp):\/\/(.*)
RewriteRule ^(.+)$ - [env=rfi:true]
</IfModule>
CustomLog /path/to/logs/rfi.log combined env=rfi
You will also have to deny access from requests that have been caught by the above rewrite
Deny from env=rfi
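The Deny from env=rfi line uses the Apache 2.2 access-control syntax. On Apache 2.4 the same vhost setup might be assembled as in the sketch below, assuming mod_rewrite and mod_authz_core are loaded. The log path is the placeholder used above, and the Directory path is hypothetical.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Flag any request whose query string contains a URL scheme.
    RewriteCond %{QUERY_STRING} (?:https?|ftp):// [NC]
    RewriteRule ^ - [E=rfi:true]
</IfModule>

# Log only the flagged requests.
CustomLog /path/to/logs/rfi.log combined env=rfi

# Hypothetical document root; refuse flagged requests.
<Directory "/var/www/html">
    <RequireAll>
        Require all granted
        Require not env rfi
    </RequireAll>
</Directory>
```

QUERY_STRING is matched as sent by the client, so a URL-encoded scheme such as http%3A%2F%2F would slip past this pattern; treat the rule as one layer of defence rather than a complete filter.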

How to 301 root site to new folder but allow new website on root

OK, let me see if I can explain this simply.
I have a forum that was hosted as my home page on www.mysite.com. It's well indexed and I'd hate to lose any ranking.
Today I moved the entire site from the root of the domain to www.mysite.com/forum to make way for our new CMS, which will now be the home page. (This is to help new users and easily guide visitors to our new store.)
Currently I'm using this in my .htaccess file:
RewriteEngine on
RewriteCond $1 !^Home
RewriteCond %{HTTP_HOST} ^mysite.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.mysite.com$
RewriteRule ^(.*)$ "http\:\/\/www\.mysite\.com\/forum\/$1" [R=301,L]
As you can see this takes care of redirects while still allowing me to access the cms located on /home
Here's the million dollar question:
Is there a way to put the CMS onto the root domain while still redirecting all of the old forum links? I appreciate your help and hope I explained myself correctly :)
It would take quite a bit of work.
Your HTTP server won't know the difference between http://www.mysite.com/ (the old forum link) and http://www.mysite.com/ (the new CMS link).
However, and this is a big however, you can redirect all of the http://www.mysite.com/forum-link to http://www.mysite.com/forum/forum-link. You're probably going to have to write a RewriteRule for every unique forum-link you have.
One better possibility would be to put the new CMS pages at http://www.mysite.com/cms and redirect http://www.mysite.com there.
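If the old forum URLs share a few recognisable entry points, one pattern per entry point may be enough instead of one rule per link. A sketch for the root .htaccess, assuming phpBB-style script names (hypothetical; substitute whatever the forum actually used) that the new CMS does not use itself:

```apache
RewriteEngine On

# Old forum scripts that used to live in the site root (assumed names).
# The query string (?t=123 and so on) is carried over automatically.
RewriteRule ^((?:viewtopic|viewforum|memberlist)\.php)$ /forum/$1 [R=301,L]

# Everything else, including the home page, is left to the new CMS.
```

This only helps when forum and CMS paths do not collide; the bare home page URL cannot be told apart and will always belong to whichever application owns the root.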
Another option: redirect all requests to /forum/*, except those of the form /*?no_redirect.
Then write a 404 error handler for the /forum directory that redirects the user to /[requested_url]?no_redirect.