Search and replace from a list of replacements - mysql

Using the HTML Import 2 plugin for WordPress, I've gathered a list of old file paths and what they've been changed to.
Instead of bulking up my .htaccess with redirects, I was hoping there was a way to replace all the old links with the new ones.
For instance, I have a list:
oldlink1, newlink1
oldlink2, newlink2
oldlink3, newlink3
oldlink4, newlink4
and I want to replace every occurrence of oldlink1 with newlink1, oldlink2 with newlink2, and so on. Is that possible?

Are you asking about file and image paths in the post content? (1) Or are you asking about post permalinks, to redirect the old .html URLs to new WordPress URLs without the .html suffix? (2)
1) For file image paths in post content, probably the easiest and most foolproof approach is to use a find/replace plugin that will provide a front end to the database so you don't run queries directly on the database.
Try http://wordpress.org/plugins/search-regex/. With it, you can find/replace post content, post meta, comment content, etc.
Search Regex adds a powerful set of search and replace functions to
WordPress. These go beyond the standard searching capabilities, and
allow you to search and replace almost any data stored on your site.
In addition to simple searches you have the full power of PHP's
regular expressions at your disposal.
2) For URL redirects, you can try http://wordpress.org/plugins/redirection/ which will allow a CSV import of URL redirects while logging 404s and redirects.
Redirection is a WordPress plugin to manage 301 redirections and keep
track of 404 errors without requiring knowledge of Apache .htaccess
files.... This is particularly useful if you are migrating pages from
an old website, or are changing the directory of your WordPress
installation.
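For option 1, the list-driven replacement the question asks about can be sketched as follows. This is a minimal sketch, not what Search Regex does internally: it assumes the links live in the default WordPress table and column (wp_posts.post_content), that the list is available as "old, new" text, and that the statements will be run through a MySQL driver that uses %s placeholders. The function names are made up for illustration. Back up the database before running anything like this.

```python
import csv
import io

# Assumption: default WordPress table/column; adjust if your prefix differs.
UPDATE_SQL = "UPDATE wp_posts SET post_content = REPLACE(post_content, %s, %s)"


def load_pairs(csv_text):
    """Parse 'old, new' lines into (old, new) tuples, skipping blank rows."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    pairs = [(row[0].strip(), row[1].strip()) for row in reader if len(row) >= 2]
    # Longest old link first, so 'oldlink1' cannot clobber part of 'oldlink10'.
    return sorted(pairs, key=lambda pair: len(pair[0]), reverse=True)


def build_statements(pairs):
    """Return (sql, params) tuples ready for a DB-API cursor.execute()."""
    return [(UPDATE_SQL, pair) for pair in pairs]


def replace_links(text, pairs):
    """Apply the same replacements to a plain string, e.g. for a dry run."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text
```

Sorting by length matters because the replacements are applied one after another: a short old link that is a prefix of a longer one would otherwise rewrite part of the longer link first.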

With this script you can do a search & replace on your WordPress database. Just follow the instructions, and don't forget to remove the script after you're done.

Related

Website auto-detect SHTML pages

We just received the code for a website. Nearly all of the links are broken because they didn't add .shtml as a file extension to any of the links.
Is this due to them simply forgetting or is there some configuration where we would basically tell the website to automatically detect ".shtml" pages?
Thanks!
There are numerous ways for a server to map a URL onto a file based on a word in the URL with the addition of an extension (either a hard coded one or one provided in a list).
The obvious ones that spring to mind for Apache HTTP Server (your mileage will vary with your web server) are MultiViews and mod_rewrite.
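The idea behind both options can be sketched in a few lines. This is a simplified illustration of extension lookup, not Apache's actual algorithm (MultiViews also performs content negotiation); the function name, the extension list, and the set of files are assumptions made for the example.

```python
def resolve(url_path, existing_files, extensions=(".shtml", ".html")):
    """Return the file that should serve url_path, or None if nothing matches."""
    if url_path in existing_files:
        return url_path  # an exact match always wins
    for ext in extensions:
        candidate = url_path + ext
        if candidate in existing_files:
            return candidate  # first extension in the list that exists
    return None
```

With files {"/about.shtml", "/index.html"}, a link to /about resolves to /about.shtml even though the link itself carries no extension.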

Create a unique URL like facebook

How exactly does one create a unique URL?
Like how Facebook does it: facebook.com/mynamehere
One way would be to create a new folder each time we have a new user, but that doesn't seem to be the best approach.
You can try a program like Elgg if you are trying to build a social media site. Otherwise, a person's profile URL can be customized in a couple of ways, most of them already mentioned. You can use .htaccess for rewrites. You can use an automated custom URL plugin (this may help: How to generate a custom URL from a html input?). As a last resort you can use your folder method, but only if absolutely required.
I think the question is: how is it done technically, so we don't need to have physical file for every valid URL?
The answer is URL rewriting. In the case of an Apache server, you want to enable mod_rewrite and configure it to translate a particular URL pattern (like myfbclone.com/mynamehere to myfbclone.com/index.php?username=mynamehere). This way you only need one script file that handles all the URLs accordingly.
Other servers, such as Nginx or IIS, have their own means of rewriting URLs, so the exact configuration depends on your server, but the concept is usually the same.
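The translation step itself is just pattern matching. The sketch below shows the concept in plain Python rather than in any server's rewrite syntax; the username pattern and the reserved list are assumptions for illustration (a real rule would typically also skip any path that exists as a file on disk).

```python
import re
from urllib.parse import urlencode

# Assumption: a username is one path segment of letters, digits, dots or underscores.
PROFILE = re.compile(r"^/([A-Za-z0-9_.]+)/?$")

# Hypothetical names that must never be treated as usernames.
RESERVED = {"index.php", "login", "about"}


def rewrite(path):
    """Map /mynamehere to /index.php?username=mynamehere; leave other paths alone."""
    match = PROFILE.match(path)
    if match is None or match.group(1) in RESERVED:
        return path
    return "/index.php?" + urlencode({"username": match.group(1)})
```

The single script (index.php in the answer above) then reads the username parameter and looks the user up, so no per-user folder or file is needed.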

html directory listing formatting

So, I've been trying to get a web page to display links to videos (over a symbolic link) dynamically (i.e., without hardcoding an <a></a> tag for each one), and I think I may have found a solution, albeit a hacky one:
Video
Ignoring that this is a horrible way to do this, does anyone know how to format the following?:
I'm guessing there is an Apache config file somewhere, but it is extremely hard to search for, as I do not know what it is called when files are just listed in this manner.
I'm basically looking to resize the column widths, and maybe even do some prettification.
This is all running on my web/file server and is being accessed from my local machine.
This is what you're looking for:
http://perishablepress.com/better-default-directory-views-with-htaccess/
This tutorial details how Apache's directory listing can be modified to suit your taste using an .htaccess file.
Using the Apache HeaderName and ReadmeName directives from the mod_autoindex module, you can add custom markup to your directory listing pages.
For displaying links to A/V and other files, look at my website: https://wrcraig.com/ApacheDirectoryDescriptions.
It goes beyond the default directory description, providing a spreadsheet to assist in creating detailed descriptions and exporting them in FancyIndex/AddDescription format for inclusion in .htaccess.
It also provides a menu driven BASH scripted alternative, using the FancyIndex descriptive data above (automatically adding A/V durations) to recursively populate a custom index.html while retaining the security features of .htaccess.
The site has examples of the input spreadsheet and both the FancyIndex output and the optional BASH scripted output.
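If you go the generated index.html route instead of styling Apache's own listing, the core of such a generator is small. The following is a minimal sketch in Python, not the BASH script described above; the extension set, function name, and markup are assumptions chosen for the example.

```python
import html
from pathlib import PurePosixPath
from urllib.parse import quote

VIDEO_EXTENSIONS = {".mp4", ".mkv", ".webm"}  # assumption: adjust to your files


def build_index(filenames, title="Videos"):
    """Return an HTML fragment linking to every video in filenames, sorted by name."""
    rows = []
    for name in sorted(filenames):
        if PurePosixPath(name).suffix.lower() not in VIDEO_EXTENSIONS:
            continue  # skip anything that is not a video
        # quote() makes the href URL-safe; html.escape() protects the visible text.
        rows.append(f'<li><a href="{quote(name)}">{html.escape(name)}</a></li>')
    items = "\n".join(rows)
    return f"<h1>{html.escape(title)}</h1>\n<ul>\n{items}\n</ul>\n"
```

Feeding it the output of os.listdir() for the video directory and writing the result to index.html gives a page you can style freely with CSS, at the cost of having to regenerate it when files change.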

Why doesn't Wikipedia have extensions?

Looking at a random Wikipedia article like http://en.wikipedia.org/wiki/Impostor_syndrome, I see that there's no .html attached to the end of the address. In fact, if I do try to put a .html after it, Wikipedia tells me "Wikipedia does not have an article with this exact name." How come it doesn't need any file extensions?
More a superuser question?
There is no law saying that an HTML file has to end in .html or .htm, and since the wiki generates pages from a database there is really no file there anyway (except in a cache).
Not having .htm or .php is more sensible - why do you care what technology they use when you ask for a URL? It would be like having to put the operating system of the recipient at the end of their email address.
If you make a call to a website, it probably looks like
www.example.com/siteA/index.html
This request just tells the web server that you want to see a resource called index.html in siteA.
The website that runs on this server has to determine what you want to see and how the data is loaded.
index.html could be a file in the siteA directory
or
it could be a row with the key "index.html" in the siteA table in your database.
So the part siteA/index.html is just a resource identifier. The grammar of this resource identifier is completely free and is determined per website.
URL rewriting is also commonly used to make URLs easier to read and remember.
For example, there could be a rewrite rule to accomplish the following:
If the user enters something like
www.example.com/download/demo.zip
rewrite it so your website sees it as:
www.example.com/download.php?file=demo.zip
Wikipedia's servers map the URL to the page you want. .html is just a naming convention that today is mostly historical, from the period of static pages when URLs actually were names of files on the server. In fact, there may be no file at all: the server queries the database and a web framework sends out the HTML on the fly.
Wikipedia is most likely using the Apache module mod_rewrite in order to not have to link paths directly to a file system path.
See: http://en.wikipedia.org/wiki/Rewrite_engine#Web_frameworks
However, programming languages can also take control of the incoming URLs and return data depending on the structure of the link according to some set of rules; for example, the Django web framework employs a URL dispatcher.
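To make the dispatcher idea concrete, here is a minimal sketch in plain Python. It is not Django's API; the route table, view functions, and URL patterns are all made up for illustration.

```python
import re


def article(title):
    """Hypothetical view: would normally load the article from a database."""
    return f"article page for {title}"


def search(term):
    """Hypothetical view: would normally run a search query."""
    return f"search results for {term}"


# Ordered (pattern, view) table; the first matching pattern wins.
ROUTES = [
    (re.compile(r"^/wiki/(?P<title>[^/]+)$"), article),
    (re.compile(r"^/search/(?P<term>[^/]+)$"), search),
]


def dispatch(path):
    """Find the first route matching path and call its view with the captured parts."""
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            return view(**match.groupdict())
    return "404 not found"
```

Note that dispatch("/wiki/Impostor_syndrome.html") would simply look up an article titled "Impostor_syndrome.html", which is consistent with what the question observed: the extension is treated as part of the name, not as a file type.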
That's because Wikipedia uses MediaWiki's feature of URL shortening.
Actually when you search for a file it really loads a php file. Try searching for a word that doesn't exist, for example "Pazaz". The URL is http://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=pazaz . Notice index.php in the URL.
To tell the truth it's not a MediaWiki feature, it's Apache. For further info http://www.mediawiki.org/wiki/Manual:Short_URL .
URL routing is your answer, for example in ASP.NET. Read the quoted source below:
The ASP.NET MVC framework includes a flexible URL routing system that enables you to define URL mapping rules within your applications. The routing system has two main purposes:
Map incoming URLs to the application and route them so that the right Controller and Action method executes to process them
Construct outgoing URLs that can be used to call back to Controllers/Actions (for example: form posts, links, and AJAX calls)
I would suggest that sites like this use some sort of Model View Controller framework similar to Ruby on Rails where the url 'directories' form a part of a request/url route...
In frameworks that are MVC based, the url 'directories' can dictate what View/Controller to utilise as well as what action should be taken with the data.
eg: shop.com/product/carrots
Where product is a view/controller and carrots is the data. The framework then analyses which action/route to take. Default could be viewing the product information and price of the carrot.
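A convention-based version of that lookup might look like the sketch below. It does not reproduce Rails or any specific framework; the controller class, the registry, and the "view" default action are hypothetical names chosen to mirror the carrots example.

```python
class ProductController:
    """Hypothetical controller; 'view' acts as the default action."""

    def view(self, item):
        return f"showing product info and price for {item}"

    def edit(self, item):
        return f"editing {item}"


# The first URL segment selects the controller.
CONTROLLERS = {"product": ProductController}


def route(path, default_action="view"):
    """Split /controller/data[/action] and call the matching controller method."""
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) < 2 or segments[0] not in CONTROLLERS:
        return "404 not found"
    controller = CONTROLLERS[segments[0]]()
    item = segments[1]
    action = segments[2] if len(segments) > 2 else default_action
    # Refuse private names so a URL cannot reach internals of the object.
    if action.startswith("_") or not callable(getattr(controller, action, None)):
        return "404 not found"
    return getattr(controller, action)(item)
```

Here route("/product/carrots") falls back to the default action and shows the carrot's details, while route("/product/carrots/edit") picks the edit action from the third segment.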

web-development: how do you usually handle the "under construction" page?

I was wondering what's the best way to switch a website to a temporary "under construction" page and then switch it back to the new version.
For example, in a website, my customer decided to switch from Joomla to Drupal and I had to create a subfolder for the new CMS, and then move all the content to the root folder.
1) Moving all the content back to the root folder always creates some problems with file permissions, links, etc.
2) Creating a rewrite rule in .htaccess or forwarding with PHP is not a solution, because a different URL is shown that includes the top folder.
3) Many hosting services do not allow changing the root directory, so this is not an option since I don't have access to the Apache config file.
Thanks
Update: Could I forward only the domain (i.e. www.example.com) and leave the IP (i.e. 123.24.214.22) pointing at the root folder, so that access is different for me and for other people? Can I do this in the .htaccess file?
One thing to consider is that you don't want search engines to cache your under construction page, and you also don't want them to drop your homepage from the search index (hence just adding a "noindex" meta tag isn't the perfect solution).
A good way to deal with this is do a 302 redirect (temporarily moved) from your homepage to your under construction page - that way the search engine does not cache your homepage as an under construction page, does not index your under construction page (assuming it has a NOINDEX meta tag), and does not drop your homepage from the search index either.
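The behaviour described above can be sketched as a tiny WSGI application. This is only an illustration of the 302-plus-noindex idea, not a drop-in for a Joomla or Drupal site; the flag name, the construction page URL, and the page bodies are assumptions. Unlike the answer, which redirects only the homepage, this sketch redirects every request while the flag is on.

```python
SITE_OFFLINE = True  # assumption: flip to False once the new site is live
CONSTRUCTION_PATH = "/under-construction.html"  # hypothetical URL


def app(environ, start_response):
    """WSGI app: temporarily redirect all traffic to the construction page."""
    path = environ.get("PATH_INFO", "/")
    if SITE_OFFLINE and path != CONSTRUCTION_PATH:
        # 302 marks the move as temporary, so the original URL stays indexed.
        start_response("302 Found", [("Location", CONSTRUCTION_PATH)])
        return [b""]
    if path == CONSTRUCTION_PATH:
        # noindex keeps the placeholder itself out of the search index.
        body = b'<meta name="robots" content="noindex"><h1>Under construction</h1>'
    else:
        body = b"<h1>Home</h1>"
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]
```

It can be tried locally with wsgiref.simple_server.make_server("", 8000, app).serve_forever() from the standard library.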
One way would be the use of an include on your template page.
When you want the construction page to show, you set a redirect in the include to take all traffic to the construction page.
When you are done, you remove the redirect.
What about hijacking your index.php file?
Something simple, along the lines of
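<?php
// SITE_OFFLINE must already be defined, e.g. in a config file included earlier.
if (SITE_OFFLINE) {
    include 'under_construction.html';
    exit; // stop here so the normal page is not rendered
}
// normal content of your index page
?>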
where you would naturally define SITE_OFFLINE in an appropriate place for your needs.
What I did when I used PHP for websites was to configure Apache to direct all requests to a front controller. You then would have full access to all requests no matter where they are pointing to. Then in your front controller (PHP file, static html file, etc.), you would do whatever you need to do there.
I believe you need to configure path info in Apache (AcceptPathInfo is the relevant core directive) and some other settings; it has been about 3 years since I have used that approach. But this approach is also good for developing your own CMS or application so that you have full control over security.
You have to do something similar to this:
http://www.phpwact.org/pattern/front_controller
I am looking for more details, I know my configuration had more to it than that.
This is part of what I'm looking for too:
http://httpd.apache.org/docs/2.0/mod/core.html
Enabling path_info passes path information to the script, so all requests now go through a single point of entry. Let me find my configuration, I know vaguely how this works, but I'm sure it looks like a lot of hand waving.
Also, keep in mind that because all requests are going through this single PHP file, you are responsible for serving images, JavaScript, CSS, etc. So, if a request comes in for /css/default.css, it will go through your PHP script (index.php, most likely), and you'll need to determine how to handle it. Serving static files is trivial, but it is a little more work.
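To show what that extra work amounts to, here is a minimal front controller sketch in Python rather than PHP. The in-memory dictionaries stand in for files on disk and for page handlers, and every name in it is hypothetical.

```python
import mimetypes

# Hypothetical stand-ins for static files on disk and for dynamic page handlers.
STATIC_FILES = {"/css/default.css": b"body { margin: 0 }"}
PAGES = {"/": lambda: "<h1>Home</h1>"}


def front_controller(path):
    """Single entry point: return (status, content_type, body) for any path."""
    if path in STATIC_FILES:
        # The controller must pick the Content-Type itself for static files.
        content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
        return 200, content_type, STATIC_FILES[path]
    if path in PAGES:
        return 200, "text/html", PAGES[path]().encode("utf-8")
    return 404, "text/plain", b"not found"
```

Because every request passes through this one function, it is also the natural place for an access check or an "under construction" switch.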
If you don't want to go that route, you could possibly do something with mod_rewrite so that it only looks for .html, .htm pages or however you have your site configured. For me, I don't do extensions, so that made my regex a little more difficult. I also wanted to secure access to all files. The path_info was the solution for me, but if you don't need that granularity, then writing a front controller might be a bit too much work.
Walter