Redirect Hardcoded URLs .htaccess - html

I have a very old website and it's written in plain HTML. The website used to be hosted publicly, however I downloaded an archive and I now only need it on my localhost (XAMPP) environment.
I had a look at the website code and every link is hardcoded and begins with http://example.com. The site contains thousands of links, for example:
http://example.com/category/
http://example.com/category/item
http://example.com/category/item/details/item123.html
etc
What's the most efficient way for me to replace all of the links beginning with http://example.com/ with http://localhost/?
I could use find and replace, but should I use a .htaccess file instead?
I should add that the domain example.com no longer exists and I have no access to the hosting.
So for example if I click a link that's hardcoded to http://example.com/category/item/details/item123.html it should go to http://localhost/category/item/details/item123.html
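As a rough sketch of the find-and-replace option: a .htaccess file on localhost only sees requests that actually arrive at localhost, so links that still name the old host have to be changed in the files themselves (or the old host name pointed at 127.0.0.1). The script below strips the old origin so every link becomes root-relative (/category/item), which resolves to http://localhost/... when the site sits in the document root. The function names and the idea of running it over a copy of the site folder are assumptions for illustration, not something from the question.

```python
import re
from pathlib import Path

# Assumption: the old origin is http://example.com (optionally https).
# The lookahead stops look-alike hosts such as example.community or
# example.com.au from matching.
OLD_ORIGIN = re.compile(r"https?://example\.com(?![\w.:-])/?", re.IGNORECASE)


def make_root_relative(html: str) -> str:
    """Replace the old origin with "/" so links become root-relative."""
    return OLD_ORIGIN.sub("/", html)


def rewrite_site(root: Path) -> int:
    """Rewrite every .htm/.html file under root; return how many changed."""
    changed = 0
    for path in root.rglob("*.htm*"):
        if not path.is_file():
            continue
        # latin-1 maps bytes to characters one-to-one, so files in any
        # ASCII-compatible encoding round-trip without corruption.
        original = path.read_bytes().decode("latin-1")
        updated = make_root_relative(original)
        if updated != original:
            path.write_bytes(updated.encode("latin-1"))
            changed += 1
    return changed
```

Something like rewrite_site(Path("C:/xampp/htdocs")) would process a default XAMPP install on Windows; that path is only a guess at the setup, and it is safer to run this on a copy of the archive first.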

Related

Why does an HREF consisting of a plain backslash direct to the homepage?

I came across an HTML anchor which reads <a href="\">Home</a>.
Normally we put something like <a href="index.php">Home</a>, but when I click on this Home link I am still taken to the index page on the website.
I can't replicate the behavior on localhost.
Why does \ direct to the website's homepage, and was it intentional on the developer's part?
You are correct that it is incorrect, and it's almost certainly not intentional. Backslashes (\) are considered unsafe in URLs, and if a backslash is necessary in your URL you would normally have to encode it as %5C.
Why it works
As Rocket Hazmat pointed out in a comment on your question, most browsers automatically substitute / for \ in URLs.
So the link to \ is converted to /, which requests the root of the current server. The server is probably set up to serve some default file like index.php when it receives a request for a directory, and the result is loading the homepage.
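The substitution can be imitated in a few lines. This is only a sketch of the behaviour described above, not browser code: it assumes the page uses one of the web schemes for which browsers treat a backslash as a forward slash, and it leaves any query string or fragment untouched. The function name resolve_like_browser and the sample URLs are made up for the illustration.

```python
import re
from urllib.parse import urljoin, urlsplit

# Schemes for which browsers treat "\" like "/" (an assumption based on
# the WHATWG URL Standard's list of "special" schemes).
SPECIAL_SCHEMES = {"http", "https", "ftp", "ws", "wss", "file"}


def resolve_like_browser(base: str, href: str) -> str:
    """Resolve href against base, swapping backslashes for slashes first."""
    if urlsplit(base).scheme in SPECIAL_SCHEMES:
        # Only the part before any query string or fragment is rewritten.
        match = re.search(r"[?#]", href)
        cut = match.start() if match else len(href)
        href = href[:cut].replace("\\", "/") + href[cut:]
    return urljoin(base, href)


# A lone backslash ends up pointing at the site root.
print(resolve_like_browser("http://www.example.com/products/list.html", "\\"))
```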
Why it doesn't work in localhost
I don't know your local HTTP server setup, but chances are it hasn't been configured to serve a specific page (like index.php) when it receives a request for a directory. So you are likely just seeing a directory listing of whatever is at the root of your local server.

Replace URLs in Typo3 DB

So I have a site created with Typo3. I also have a domain, www.example.org, which is linked to the folder of the Typo3 installation.
I created a subdomain, linked it to the same folder, and used the main domain for something else.
But now everything on the subdomain still has the URL structure of the main site, so when I open sub.example.org all the links and images still have the URL from www.example.org/...
I exported the database, replaced every URL with Notepad++ and imported it again, but that didn't change anything. What am I doing wrong?
There are two (three with realurl) places where you need to look when changing the domain of a TYPO3 site, assuming everything was done by the book and no one hardcoded the domain all over the place.
Usually you do not need to work in the database directly.
After doing the changes, make sure to clear the caches (install tool in 6.2+, "all caches" in earlier versions).
First:
There are two TypoScript settings that influence the generated URL: config.baseURL and config.absRefPrefix.
The recommended way to use those is to not set config.baseURL (would result in a <base> tag in the HTML <head>), and set config.absRefPrefix to the subpath where TYPO3 is, relative to the document root. If TYPO3 lies directly in the document root, set it to /.
Second:
In the database, there are "Domain Records". They are usually located on the root page of a site. Change those to the new domain.
Third (with realurl only):
Check the realurl configuration file, usually located in typo3conf/realurl_conf.php. Depending on your setup, the old domain name is used there and needs to be changed.

Search and replace from a list of replacements

Using html import 2 plugin for wordpress, I've gathered a list of old file-paths and what they've been changed to.
Instead of bulking up my .htaccess with redirects, I was hoping there was a way to replace all the old links with the new ones.
For instance, I have a list:
oldlink1, newlink1
oldlink2, newlink2
oldlink3, newlink3
oldlink4, newlink4
and I want to replace every occurrence of oldlink1 with newlink1. Is that possible?
Are you asking about file and image paths in the post content? (1) Or are you asking about post permalinks, to redirect the old .html URLs to new WordPress URLs without the .html suffix? (2)
1) For file and image paths in post content, probably the easiest and most foolproof approach is to use a find/replace plugin that provides a front end to the database, so you don't run queries directly on the database.
Try http://wordpress.org/plugins/search-regex/ With it, you can find/replace post content, post meta, comment content, etc.
Search Regex adds a powerful set of search and replace functions to
WordPress. These go beyond the standard searching capabilities, and
allow you to search and replace almost any data stored on your site.
In addition to simple searches you have the full power of PHP's
regular expressions at your disposal.
2) For URL redirects, you can try http://wordpress.org/plugins/redirection/ which will allow a CSV import of URL redirects while logging 404s and redirects.
Redirection is a WordPress plugin to manage 301 redirections and keep
track of 404 errors without requiring knowledge of Apache .htaccess
files.... This is particularly useful if you are migrating pages from
an old website, or are changing the directory of your WordPress
installation.
With this script you can do a search & replace on your WordPress database. Just follow the instructions, and don't forget to remove the script after you're done.
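Independently of any plugin or script, the list-driven replacement itself is small. The sketch below assumes the list is available as "old, new" lines like the ones in the question and that the content is already loaded as a string; load_pairs and replace_all are hypothetical helper names. All pairs are applied in a single pass, longest old link first, so oldlink1 does not clobber oldlink10 and a new link is never replaced again by a later pair.

```python
import csv
import io
import re


def load_pairs(csv_text: str) -> dict[str, str]:
    """Parse "old, new" lines into a mapping of old link -> new link."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


def replace_all(text: str, pairs: dict[str, str]) -> str:
    """Replace every old link with its new link in one pass over text."""
    if not pairs:
        return text
    # Longest keys first, so a short key cannot match inside a longer one.
    keys = sorted(pairs, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: pairs[match.group(0)], text)
```

Running replace_all over exported post content with the parsed list would give the rewritten text; writing it back to the database is left to whichever tool is used for that, ideally after a backup.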

problems with file directory ftp

I'm new to using hosting and I have a question about FTP. Why is it that, if I upload something (for example an image) to my server, I cannot see it in the browser using its directory, for example http://www.mywebsite.com/public_html/images/backgrounds/background.png? If I use that address I get a file with a "?" sign instead of the image. The only way to see the image is to change http to ftp, for example ftp://ftp.mywebsite.com/public_html/images/backgrounds/background.png
How can I reach the files over http instead of ftp, so that I can use them in my web page's HTML?
Thank you.
Typically, the public_html folder is the root of your domain, which is to say that http://www.mywebsite.com/ points to your/relative/path/to/public_html/
Using your example, a file placed at /public_html/images/backgrounds/background.png should be accessible at http://www.mywebsite.com/images/backgrounds/background.png
Similarly, if you put filename.html in the /public_html/ folder of your server, you should be able to access it at http://www.mywebsite.com/filename.html. If you put it in a subfolder of /public_html/, say at public_html/example/, it should be accessible at http://www.mywebsite.com/example/filename.html
This can vary from one server to another, but in most situations this is common practice.
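The mapping described above can be written down as a tiny helper. It is only a sketch under the assumption that public_html is the web root, as in the answer; the function name url_for_upload and its default arguments are invented for the example, and a real host may be configured differently.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def url_for_upload(server_path: str,
                   web_root: str = "/public_html",
                   site: str = "http://www.mywebsite.com") -> str:
    """Turn an FTP-side path under the web root into its public URL."""
    # relative_to raises ValueError when the file is outside the web root,
    # in which case it is not reachable over HTTP at all.
    relative = PurePosixPath(server_path).relative_to(web_root)
    if relative == PurePosixPath("."):
        return site + "/"
    # quote() percent-encodes spaces and similar characters in file names.
    return f"{site}/{quote(relative.as_posix())}"


print(url_for_upload("/public_html/images/backgrounds/background.png"))
```

The point to notice is that public_html itself never appears in the resulting URL.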

Do I need to specify a webpage's url?

I've uploaded several files to my server and it's really quite baffling. The home page is saved as index.html, and when I type in the URL of said page it miraculously, and quite successfully shows the right page. What about my other pages? I have linked to them from the home page with the following code:
<a href="http://www.example.com/about/">About Us</a>
How is my HTML file, presumably called about.html, supposed to know that its URL is "http://www.example.com/about/"? I am dubbing this "The Unanswered Question" because I have looked at numerous examples of metadata and there is nothing about specifying the URL of a page.
It depends on what type of server you are running.
Static web servers
If it is the simplest kind of static file server with no URL aliasing or rewriting then URLs will map directly to files:
If your "web root" was /home/youruser/www/, then that means:
http://www.example.com -> /home/youruser/www/
And any paths (everything after the domain name) translate directly to paths under that web root:
http://www.example.com/about.html -> /home/youruser/www/about.html
Usually web servers will look automatically for an "index.html" file if no file is specified (i.e. the URL ends in a /):
http://www.example.com/ -> /home/youruser/www/index.html
http://www.example.com/about/ -> /home/youruser/www/about/index.html
In Apache, the filename searched for is configurable with the DirectoryIndex directive:
DirectoryIndex index.html index.txt /cgi-bin/index.pl
That means that for every request to a path that ends in a /, the server tries each of those names in order (and, to add yet another rule, under some common settings it will automatically append a / if the path is the name of a directory, for example 'about'):
http://www.example.com/ -> /home/youruser/www/index.html
-> or /home/youruser/www/index.txt
-> or /home/youruser/www/cgi-bin/index.pl
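The lookup rules above can be sketched as a small function. This is not how Apache is implemented; it is a simplified model that assumes a plain directory tree, treats index names with a leading slash as relative to the web root (as with /cgi-bin/index.pl), and refuses paths that escape the root. The name resolve_static is hypothetical.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_static(web_root: Path, url_path: str,
                   index_names: Sequence[str] = ("index.html",)) -> Optional[Path]:
    """Map a URL path to a file under web_root, or None if nothing matches."""
    root = web_root.resolve()
    target = (root / url_path.lstrip("/")).resolve()
    # Refuse anything that climbs out of the web root via "..".
    if target != root and root not in target.parents:
        return None
    if target.is_file():
        return target
    if target.is_dir():
        # Try each configured index name in order, first hit wins.
        for name in index_names:
            if name.startswith("/"):
                candidate = root / name.lstrip("/")   # relative to the web root
            else:
                candidate = target / name             # inside the requested directory
            if candidate.is_file():
                return candidate
    return None
```

With index_names set to ("index.html", "index.txt", "/cgi-bin/index.pl"), a request for /about/ would return about/index.html if it exists, then about/index.txt, and finally the script under cgi-bin.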
Web servers with path interpretation
There are too many different types of servers which perform this functionality to list them all, but the basic idea is that a request to the server is captured by a program and then the program decides what to output based on the path.
For example, a program might perform different routes for basic matching rules:
*.(gif|jpg|css|js) -> look for and return the file from /home/user/static
blog/* -> send to a "blog" program to generate the resulting page
using a combination of templates and database resources
Examples include:
Python
Java Servlets
Apache mod_rewrite rules (used by WordPress, etc.)
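To make the idea of path interpretation concrete, here is a minimal dispatcher built on the two matching rules shown earlier. It is a toy, not a real framework: the handlers only return strings describing what they would do, and every name in it (serve_static, render_blog, dispatch) is made up for the illustration.

```python
import re
from typing import Callable, List, Tuple


def serve_static(path: str) -> str:
    """Pretend to return a file from the static directory."""
    return f"static:/home/user/static/{path}"


def render_blog(path: str) -> str:
    """Pretend to build a page from templates and database content."""
    return f"blog:{path}"


# Rules are checked in order, so earlier patterns win over later ones.
ROUTES: List[Tuple[re.Pattern, Callable[[str], str]]] = [
    (re.compile(r".*\.(gif|jpg|css|js)"), serve_static),
    (re.compile(r"blog/.*"), render_blog),
]


def dispatch(url_path: str) -> str:
    """Pick a handler based only on the request path."""
    path = url_path.lstrip("/")
    for pattern, handler in ROUTES:
        if pattern.fullmatch(path):
            return handler(path)
    return "404"
```

Note that no file called blog/2014/hello has to exist for dispatch("/blog/2014/hello") to produce a page; the program decides what the path means.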
Links in HTML pages
Finally, the links in the HTML pages just change the URL of the location bar. The behavior of an HTML link is the same regardless of what exists on the server. And the server, in turn, only responds to HTTP requests and only produces resources (HTML, images, CSS, JavaScript, etc.), which your browser consumes. The server only serves those resources and does not have any special behavioral link with them.
Absolute URLs are those that start with a scheme (such as http: as you have done). The whole content of the location bar will be replaced with this when the user clicks the link.
Domain relative URLs are those that start with a forward slash (/). Everything after the domain name will be replaced with the contents of this link.
Relative URLs are everything else. Everything after the last directory (/) in the URL will be replaced with the contents of this link.
Examples:
My page on "mydomain.com" can link to your site using <a href="http://www.example.com/about/">Example.com about</a>, just as you have done.
If I change my link to <a href="/about/">about</a>, then it will point to mydomain.com instead.
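The three kinds of link can be checked with Python's standard urljoin, which resolves an href against the address of the page it appears on in the same way for these simple cases. The page address used here is a hypothetical one on mydomain.com, and target_of is just a convenience wrapper.

```python
from urllib.parse import urljoin

# Hypothetical page that contains the links.
CURRENT_PAGE = "http://mydomain.com/blog/post.html"


def target_of(href: str, page: str = CURRENT_PAGE) -> str:
    """Return the address a click on href would lead to from page."""
    return urljoin(page, href)


# Absolute URL: replaces the whole address.
print(target_of("http://www.example.com/about/"))  # http://www.example.com/about/
# Domain-relative URL: replaces everything after the domain name.
print(target_of("/about/"))                        # http://mydomain.com/about/
# Relative URL: replaces everything after the last / of the current page.
print(target_of("about.html"))                     # http://mydomain.com/blog/about.html
```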
An answer to your question
How is my HTML file, presumably called about.html, supposed to know that its URL is "http://www.example.com/about/"?
First, the file itself has no idea what its URL is. Unless:
the HTML was dynamically generated using a program. Most server-side languages provide a way to get this.
after the page is served, client-side scripts can also detect the current URL
Second, if the URL is /about and the file is actually about.html then you probably have some kind of rewriting going on. Remember that paths, in their simplest, are literal translations and /about is not the same as about.html.
Just use /about.html to link to the page
Theoretically, it's better for URLs in your documents to be relative, so that you don't have to change them if you change the domain or the files' location.
For example, if you move it from localhost to your hosted server.
In your example, instead of www.example.com/about.html use /about.html.
Given the link above, you would need an about page named index.html located in a directory named about for your example to work. That is, however, not common practice.
I'm a bit confused, but here is some information. Any file named "index" is the default display page for any directory(folder) when trying to view that directory.
All files in a folder are always relative to that directory. So if your link is in a file, within a different directory, then you must type in that directory along with the file. If it is the same directory, then there is no need to type in that directory, just the file name.
about.html doesn't know what its URL is; it's the index.html file that calls your about.html file.
When you're in any given directory, linking to other pages within that directory is done via a simple relative link:
<a href="about.html">About Us</a>
To move up a directory, assuming you're in a subfolder (users, perhaps), you can use the .. operator:
<a href="../about.html">About Us</a>
In your case your about page is in the same directory as the page you're linking from so it just goes to the right page.
Additionally (and I think this may be what you're asking) if you have:
about.html
about.php
about.phtml
about.jpg
for example, and you visit http://www.yoursite.com/about, then some servers (for instance Apache with content negotiation, i.e. MultiViews, enabled) will automatically bring up the HTML page; the other files have to be referenced explicitly by their full names if you want them to be used.