I have a MySQL database on a WordPress site where I want to find all the posts in a certain table that contain a URL (within a specific subdirectory) that doesn't have a trailing slash and add one.
For example, find URLs like:
https://www.example.com/directory/test
And change them to:
https://www.example.com/directory/test/
Some already have a trailing slash, so I don't want to add // on the end of those.
The directory they are in is constant so the URL will always contain /directory/*
Any ideas on what regular expression I should use? I am using the Better Search and Replace Plugin
https://deliciousbrains.com/wp-migrate-db-pro/doc/find-and-replace/#regex-find-replace
You can try this, which matches the trailing "/" only if it's present:
/(?>\/$)/
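One way to check a pattern before running it through the plugin is to try it offline first. The sketch below is only an illustration in Python, not the plugin's own syntax: it matches a URL under /directory/ whose last character is not a slash and appends one. The base URL comes from the question; the function name and the choice of URL-ending characters (whitespace, quotes, angle brackets) are assumptions.

```python
import re

# Assumed base URL from the question; adjust to the real site.
# A URL is taken to end at whitespace, a quote, or an angle bracket.
PATTERN = re.compile(
    r"""(https://www\.example\.com/directory/[^\s"'<>]*[^\s"'<>/])(?![^\s"'<>])"""
)

def add_trailing_slash(text: str) -> str:
    """Append "/" to matching URLs that do not already end in one."""
    return PATTERN.sub(r"\1/", text)

if __name__ == "__main__":
    print(add_trailing_slash('<a href="https://www.example.com/directory/test">x</a>'))
    print(add_trailing_slash("https://www.example.com/directory/test/"))
```

Note that this would also add a slash after a file name or a query string under /directory/, so it is worth reviewing the matches before replacing. In a PCRE-style tool the replacement would probably be written $1/ rather than \1/, but that depends on the plugin.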
I have many pages (posts) in a WP site with legacy short code that needs to be removed and cannot accomplish what I need using WordPress find/replace plug-ins - so I'm turning to MySQL queries.
The shortcodes to be removed all follow the same pattern: "[nivo... (variable content) .../nivo]"
I need to remove the entire nivo shortcode and ONLY the nivo shortcode because other shortcodes exist on some pages.
I found something very close to what I think I need, and modified the obvious parameters for this particular application as follows...
UPDATE `post_content`
REPLACE(txt, SUBSTRING(txt, LOCATE('[nivo', txt), LENGTH(txt) - LOCATE('nivo]', REVERSE(txt)) - LOCATE('nivo]', txt) + 10), '')
WHERE txt LIKE '%(%)%'
That ^^^ is accepted in the SQL query window (no stop signs) but returns a #1064 error when executed.
Ideally I would like to TEST this first on a specific post ID just to be sure it's really catching everything but I couldn't figure out how to write that into the query.
I know VERY little about MySQL (I'm a designer) but I have DB backups ready for rollback just in case.
Help would be greatly appreciated.
The key is that you want to use a plugin or tool that allows you to use regex (regular expressions). I have used this PHP tool; you just put it in your public_html and then access the path via a web browser: https://interconnectit.com/search-and-replace-for-wordpress-databases/
Just use a little regex in the PHP tool, or find a find-and-replace plugin that allows you to use regex. It would look something like this:
\[nivo.*?/nivo\]
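\ escapes a character; the bracket has a special meaning in regex, so \[ and \] match literal brackets
. means any character (usually excluding line breaks, unless the tool's "dot matches newline" option is on)
* means zero or more of the preceding item
? after * makes the match lazy, so it stops at the first place where the rest of the pattern can match
.*? together means anything or nothing, but as little as possible, so each match ends at the first /nivo]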
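To see why the lazy .*? matters, here is a small sketch in Python (not the PHP tool itself) that runs the same pattern against a made-up post body. The sample text and function names are hypothetical; re.DOTALL is added on the assumption that a shortcode may span several lines.

```python
import re

# Lazy version: each match stops at the first "/nivo]".
NIVO_LAZY = re.compile(r"\[nivo.*?/nivo\]", re.DOTALL)
# Greedy version, shown only for comparison: runs to the LAST "/nivo]".
NIVO_GREEDY = re.compile(r"\[nivo.*/nivo\]", re.DOTALL)

def strip_nivo(content: str) -> str:
    """Remove every nivo shortcode and leave other shortcodes alone."""
    return NIVO_LAZY.sub("", content)

def strip_nivo_greedy(content: str) -> str:
    """What happens without the "?": everything between first and last nivo goes."""
    return NIVO_GREEDY.sub("", content)

if __name__ == "__main__":
    # Hypothetical post content with two nivo shortcodes and one other shortcode.
    sample = "Intro [nivo id=1]a[/nivo] middle [gallery ids=2] [nivo id=3]b[/nivo] end"
    print(strip_nivo(sample))         # the [gallery] shortcode survives
    print(strip_nivo_greedy(sample))  # the [gallery] shortcode is lost
```

The greedy form is the one to avoid here, since it would also delete any other shortcode sitting between two nivo blocks.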
For reference (and probably a better answer):
PHP - Remove Shortcodes and Content in between with Regex Pattern
Using MySQLAdmin. I moved data from a Windows server and am trying to change the case in URLs, but the query is not finding the matches. I need the slashes because I don't want to replace text in anything but the URLs (in the post table). I think the %20 are the problem somehow?
UPDATE table_name SET field = replace(field, '/user%20name/', '/User%20Name/')
The actual string is more like:
https://www.example.com/forum/uploads/user%20name/GFCI%20Stds%20Rev%202006%20.pdf
If you are using MariaDB, you have the REGEXP_REPLACE() function.
But the best approach is to dump the table to a file, open it in Notepad++,
and run a regex replace with the following settings:
Pattern is: (https:[\/\w\s\.]+uploads/)(\w+)\%20(\w+)((\/.*)+)
Replace with: $1\u$2\%20\u$3$4
Then import the table again
Hope this helps
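If Notepad++ is not at hand, the same replacement can be sketched in Python against the dump file's text. This is an approximation of the pattern above, with the last group simplified to (/.*); the function names are made up for the example. In the Notepad++ replacement, \u upper-cases only the next character, which is what the helper below imitates.

```python
import re

# Simplified form of the pattern above: prefix up to "uploads/", two words
# joined by a literal %20, then the rest of the path.
PATTERN = re.compile(r"(https:[/\w\s.]+uploads/)(\w+)%20(\w+)(/.*)")

def _upper_first(word: str) -> str:
    # Imitates \u: upper-case the first character only.
    return word[:1].upper() + word[1:]

def capitalize_user_dir(line: str) -> str:
    """Turn .../uploads/user%20name/... into .../uploads/User%20Name/..."""
    def fix(m):
        return f"{m.group(1)}{_upper_first(m.group(2))}%20{_upper_first(m.group(3))}{m.group(4)}"
    return PATTERN.sub(fix, line)

if __name__ == "__main__":
    url = "https://www.example.com/forum/uploads/user%20name/GFCI%20Stds%20Rev%202006%20.pdf"
    print(capitalize_user_dir(url))
```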
If it's MariaDB, you can do the following:
UPDATE table_name SET field = REGEXP_REPLACE(field, '\/user%20name\/', '\/User%20Name\/');
First, please check what is actually stored in the database. %20 is the percent-encoded (URL-encoded) form of a space, not an HTML entity. The value may have been decoded before it was stored, in which case the database holds an actual space and your REPLACE doesn't match the data.
The second possibility, depending on what you are doing: the records were stored with the literal %20 because that is how you see the URL, but the %20 in your search string is converted to an actual space before the query runs, so it doesn't match the stored data.
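A quick way to see the mismatch described above is to compare the encoded and decoded forms directly. The sketch uses Python's urllib.parse purely as an illustration; the two sample strings stand in for what might be in the database and in the query, and the helper name is made up.

```python
from urllib.parse import quote, unquote

encoded = "/user%20name/"  # how the path looks inside a URL
decoded = "/user name/"    # the same path with a real space

def same_path(a: str, b: str) -> bool:
    """Compare two paths after percent-decoding both."""
    return unquote(a) == unquote(b)

if __name__ == "__main__":
    print(unquote(encoded) == decoded)  # decoding %20 gives a space
    print(quote(decoded) == encoded)    # encoding the space gives %20
    print(encoded in "uploads/user name/file.pdf")  # literal %20 search misses
```

So if a plain SELECT on the column shows spaces, the search string in REPLACE needs spaces too, and the other way round.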
In the pages I've checked, they all return the same thing, but the Mediawiki documentation says there are differences.
I'm not worried about the differences, but which one is actually stored in the page table?
Neither of them. The internal representation ("DB key form") is the title without the namespace (which is stored separately as a number in page_namespace), with spaces replaced by underscores. The code is here. Thus it's neither {{PAGENAME}}, which is the human-readable title, nor {{PAGENAMEE}}, which is {{#urlencode:{{PAGENAME}}}} with a special case for spaces -> underscores.
Got it. I saved the page "Texas A & M" and in the page table it shows as "Texas_A_&_M".
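The difference between the stored form and the URL-encoded form can be illustrated with a rough sketch. This is not MediaWiki's code, only a simplified imitation of the two behaviours described above, and both function names are invented for the example.

```python
from urllib.parse import quote

def to_db_key(title_without_namespace: str) -> str:
    """Simplified DB key form: spaces become underscores, nothing else changes."""
    return title_without_namespace.strip().replace(" ", "_")

def urlencoded_like(title_without_namespace: str) -> str:
    """Rough stand-in for the URL-encoded variant: underscores, then percent-encoding."""
    return quote(to_db_key(title_without_namespace), safe="")

if __name__ == "__main__":
    print(to_db_key("Texas A & M"))        # what the page table showed
    print(urlencoded_like("Texas A & M"))  # ampersand becomes %26
```

MediaWiki's real title normalization does more than this, so the sketch only covers the space and ampersand cases from this thread.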
According to MediaWiki's Manual:PAGENAMEE_encoding page (I can't post more than two links), PAGENAME is the only one that will convert an ampersand to an HTML character entity, while the others convert it to %26.
The following is still not correct!
I thought it was PAGENAME, but PAGENAME actually doesn't replace the spaces with underscores.
Instead, I found here and here that you can access the string that is stored in the Page table by using this:
$dbk = $title->getDBkey();
That snippet is pulled straight from Mediawiki code.
It doesn't appear there is a Magic Word associated with this key.
I can't find where the page_title in the database comes from, but it looks like it's simply the page name with the spaces, quotes, and ampersand replaced. Maybe it's database dependent. I'm using MySQL.
Table: pages
Field: url
Issue: http://google.com//search?q=something
I have a few thousand rows where this has happened. I would like a query that removes the extra slash marked below, if that is possible.
http://google.com//<--- remove the extra forward slash
http:// <--- Not to be touched
If anyone knows a MySQL query for this that would be great!
Thanks
You can use the REPLACE function. It requires three parameters:
the first is the source string,
the second is the string to be searched for,
and the third is the string to replace it with.
UPDATE pages SET url=REPLACE(url, '.com//', '.com/') WHERE url LIKE '%.com//%'
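The query above only handles URLs where the double slash follows .com. If other domains are involved, a more general rule is "collapse repeated slashes unless they come right after the colon of the scheme". The sketch below shows that rule in Python so it can be tried on sample values first; it is a different technique from REPLACE, and the function name is made up.

```python
import re

def collapse_double_slash(url: str) -> str:
    """Replace runs of two or more slashes with one, except right after "scheme:"."""
    # (?<!:) skips the "//" that follows "http:" or "https:".
    return re.sub(r"(?<!:)/{2,}", "/", url)

if __name__ == "__main__":
    print(collapse_double_slash("http://google.com//search?q=something"))
```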
There are some links in my database that do not have a trailing slash, and for consistency's sake, I want all links to have one.
All the links are in this form href="http://mysite.com/page/item/"
Now there are some links that look like this href="http://mysite.com/page/item" and href="http://mysite.com/page/item".
I cannot find out which pages they are on, but they are somewhere in the DB. Can I use phpMyAdmin and regex to find them?
If so, can anyone help me with setting up the regex code? I still cannot wrap my head around regex.
You should be able to find all entries with the REGEXP operation, e.g.
SELECT * FROM the_table WHERE href REGEXP '[^/]$'
or
SELECT * FROM the_table WHERE href NOT REGEXP '/$'
[^/]: Match any character which is not a slash; $: Match end of string.
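The two queries are almost equivalent, and the one case where they differ is easy to see by trying the same patterns outside the database. The sketch below does that in Python, which treats these two simple patterns the same way; the function names are invented for the example.

```python
import re

def ends_with_non_slash(href: str) -> bool:
    """Mirrors REGEXP '[^/]$': the last character exists and is not a slash."""
    return re.search(r"[^/]$", href) is not None

def does_not_end_with_slash(href: str) -> bool:
    """Mirrors NOT REGEXP '/$': the value does not end in a slash."""
    return re.search(r"/$", href) is None

if __name__ == "__main__":
    for href in ["http://mysite.com/page/item", "http://mysite.com/page/item/", ""]:
        print(repr(href), ends_with_non_slash(href), does_not_end_with_slash(href))
```

The only disagreement is the empty string: the first pattern needs at least one character, so it skips empty values, while the second one reports them.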