How to get post slug from Drupal database - mysql

I'm trying to export Drupal posts over to WordPress (which is in itself a hassle). I can't figure out how to maintain the URLs of the blog posts, though. Some of them are customized:
A blog post titled Story of Soil is blog/2012/03/03/soil-story in Drupal. One titled Welcome John Doe is simply /john.
Is there a Drupal function for making these URLs? Where does it store the customized URLs of the blog posts?

You can get the URL alias by using the url() function.
$url = url('node/' . $nid);

You should be able to get the alias for a node by using drupal_lookup_path:
// alias: return an alias for a given Drupal system path (if one exists).
// The second argument is a system path such as node/123, not a bare node ID.
$alias = drupal_lookup_path('alias', 'node/' . $node->nid);
Drupal manual: drupal_lookup_path or the reverse, look up node/internal path from alias: drupal_get_normal_path.
It seems the url function that Rawkode posted does about the same, so I guess it comes down to your personal preference.
Also see: http://daipratt.co.uk/how-to-get-the-path-of-a-node-from-the-node-id-in-drupal/

I found the url_alias answer tremendously helpful because I was hoping to get what I needed from the DB directly rather than by using a function. It gave me a good starting point, but running it as it sits does nothing but error out, because src and dst are not the right column names in my database (they are source and alias here). Keep in mind that this updates the table in place and will break everything if you are not working on a copy of the database. This code will fix it:
UPDATE `url_alias`
SET source=SUBSTRING_INDEX(`source`, '/', -1),
alias=SUBSTRING_INDEX(`alias`, '/', -1)
WHERE 1
Since I'm not big on doing things in place, I just added two columns, nid and slug, and then ran this query:
UPDATE `url_alias`
SET nid=SUBSTRING_INDEX(`source`, '/', -1),
slug=SUBSTRING_INDEX(`alias`, '/', -1)
WHERE 1

I found the url_alias table in the Drupal database, and I ran this SQL statement:
UPDATE `url_alias`
SET src=SUBSTRING_INDEX(`src`, '/', -1),
dst=SUBSTRING_INDEX(`dst`, '/', -1)
WHERE 1
src is now the nid and dst is now the slug. I can rename them and then INSERT INTO wp_posts as post_name
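For anyone who would rather check the split outside the database first, here is a minimal Python sketch of what SUBSTRING_INDEX(..., '/', -1) does to each column. The sample rows are hypothetical; only the soil-story and john aliases come from the question, and the node IDs are made up.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Rough Python counterpart of MySQL's SUBSTRING_INDEX (non-empty delimiter)."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything before the count-th delimiter, counted from the left.
        return delim.join(parts[:count])
    # Everything after the count-th delimiter, counted from the right.
    return delim.join(parts[count:])


# Hypothetical rows shaped like the url_alias table: (source, alias).
rows = [
    ("node/12", "blog/2012/03/03/soil-story"),
    ("node/34", "john"),
]

for source, alias in rows:
    nid = substring_index(source, "/", -1)
    slug = substring_index(alias, "/", -1)
    print(nid, slug)
```

Note that keeping only the last path segment drops the blog/2012/03/03/ prefix, so this only preserves the full URL if the WordPress permalink structure recreates that prefix.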

Related

MySQL SUBSTR LOCATE multi-search-strings

Tricky one, and my brain is mush after staring at my screen for about an hour.
I'm trying to query my database to return the first part of a string (domain name eg. http://www.example.com) in the column image_link.
I have managed this for all rows where the image_link contains .com as part of the string... but I need the code to be more versatile, so it searches for the likes of .net and .co.uk too.
Had thought some sort of nested REPLACE might work, but it doesn't make sense when I try to apply it - and I'm stuck.
Query Builder code:
$builder->select("SUBSTRING(image_link, 1, LOCATE('.com', image_link) + 3) AS domain");
Example strings, with desired results:
http://www.example.com/brands/567.jpg // http://www.example.com
https://www.example.org/photo.png // https://www.example.org
http://example.net/789 // http://example.net
Any help/advice warmly welcomed!
SELECT ... ,
SUBSTRING_INDEX(image_link, '/', 3) domain
FROM test;
Or, if protocol may be absent, then
SELECT ... ,
SUBSTRING_INDEX(image_link, '/', CASE WHEN LOCATE('//', image_link) THEN 3 ELSE 1 END) domain
FROM test;
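The same idea, sketched in Python to show why counting slashes works regardless of the top-level domain: the scheme and host are always the first three slash-separated pieces when a protocol is present, and the first piece when it is not. The function name domain_of is just an illustrative choice.

```python
def domain_of(image_link: str) -> str:
    """Return scheme + host by keeping the leading slash-separated pieces."""
    # With a protocol ("//" present) keep 3 pieces: "http:", "", "host".
    # Without one, the host is simply the first piece.
    keep = 3 if "//" in image_link else 1
    return "/".join(image_link.split("/")[:keep])


for link in (
    "http://www.example.com/brands/567.jpg",
    "https://www.example.org/photo.png",
    "http://example.net/789",
):
    print(domain_of(link))
```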

SQL Query replace string with placeholder

My table has lots of entries (image paths). Each such string is a URL and looks like this:
https://example.com/images/do.php?id=1234
And I have to change them all to this format:
https://example.com/images/1234.png
So the "ID" is equal to the filename. Replacing just the URL isn't that hard, but I have to add the static file extension, which in my case is "png", at the end of the URL string. So I need something like this:
UPDATE post SET message = REPLACE(message, 'https://example.com/images/do.php?id=', 'https://example.com/images/{id}.png');
I'm not an experienced SQL user at all, so can you please help me out?
Edit//
Now i have entries like that:
https://example.com/images/1234
https://example.com/images/5678
What query do I need to add the static file extension, so that my entries look like this?
https://example.com/images/1234.png
https://example.com/images/5678.png
The IDs are between 4 and 6 characters long. The main reason I can't just append the extension is that my message column contains more text than just the URL to modify. Such a row can look like this:
Here is your image link: [LINK]https://example.com/images/1234.png[/LINK] You can view it now.
Edit2//
My DB table named "post" does look as follows
id | message
----------------
1 | test
2 | Here is your image link: [LINK]https://example.com/images/1234[/LINK] You can view it now.
3 | some strings
4 | Here is your image link: [LINK]https://example.com/images/5678[/LINK] You can view it now.
5 | [LINK]no correct url[/LINK]
6 | [LINK][IMG]https://example.com/images/9123[/IMG][/LINK]
7 | [LINK]https://example.com/images/912364[/LINK]
So not every message row contains a URL, and not every message with a [LINK] tag contains a proper URL. Also, there are entries which have a longer ID; they should not be changed.
This answers the original version of the question.
This works for the example in your question:
select concat(substring_index(path, 'do.php', 1),
substring_index(path, '=', -1),
'.png')
from (select 'https://example.com/images/do.php?id=1234' as path) x
You can easily turn this to an update:
update post
set message = concat(substring_index(message, 'do.php', 1),
substring_index(message, '=', -1),
'.png')
where message like '%do.php%=%';
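Since substring_index(message, '=', -1) takes everything after the last =, the update above only behaves when the column holds nothing but the URL. As a rough sketch of the intended transformation when there is surrounding text, here is the same rewrite in Python with a capture group for the ID. The assumption that the ID consists of digits only is mine, based on the examples.

```python
import re

# Group 1: the fixed URL prefix. Group 2: the numeric ID (assumed digits only).
OLD_LINK = re.compile(r"(https://example\.com/images/)do\.php\?id=([0-9]+)")


def rewrite_link(message: str) -> str:
    """Turn .../images/do.php?id=1234 into .../images/1234.png."""
    return OLD_LINK.sub(r"\1\2.png", message)


print(rewrite_link("https://example.com/images/do.php?id=1234"))
print(rewrite_link("[LINK]https://example.com/images/do.php?id=5678[/LINK] ok"))
```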
For your second problem:
UPDATE post
SET message = concat(message,'.png')
Just extend your current entry with '.png' :-)
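EDIT: So, my bad! You should give regex a try (the backslashes are doubled because MySQL's string parser consumes one level of escaping before the regex engine sees the pattern):
UPDATE post
SET message = CONCAT(REGEXP_SUBSTR(message,'.*https://example\\.com/images/[0-9]{4,6}'),'.png',REGEXP_SUBSTR(message,'\\[/LINK\\].*'))
This statement assumes that no file extension exists yet; a row that already carries .png is rebuilt unchanged, so the extension won't get doubled. A row that does not match the pattern makes REGEXP_SUBSTR return NULL, though, so restrict the update with a WHERE clause such as WHERE message REGEXP 'https://example\\.com/images/[0-9]{4,6}'.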
To be sure, you can check it beforehand with:
SELECT
message
,CONCAT(REGEXP_SUBSTR(message,'.*https://example\\.com/images/[0-9]{4,6}'),'.png',REGEXP_SUBSTR(message,'\\[/LINK\\].*')) as corr_message
FROM post
Check the pattern as well on e.g. Regex101.com with your given example as string
Here is your image link: [LINK]https://example.com/images/1234[/LINK] You can view it now.
and the following as pattern
(.*https:\/\/example\.com\/images\/[0-9]{4,6})(\[\/LINK\].*)
The parentheses build groups. The first group captures the first part of the string; in the update, this is the first argument of CONCAT. After it we add '.png'. The third part within the update statement is represented by the second group of the regex pattern.
Hope, this will help you. :)
EDIT2: Alright, this will fit your rows.
UPDATE post
SET message = CONCAT(REGEXP_SUBSTR(message,'.*https://example\\.com/images/[0-9]{4,6}'),'.png',REGEXP_SUBSTR(message,'\\[/(IMG|LINK)\\].*'))
WHERE message LIKE '%https://example.com/images/%'
Pattern for checking it:
(.*https:\/\/example\.com\/images\/[0-9]{4,6})(\[\/(IMG|LINK)\].*)
As a select for checking your data before the update:
SELECT
message
,CONCAT(REGEXP_SUBSTR(message,'.*https://example\\.com/images/[0-9]{4,6}'),'.png',REGEXP_SUBSTR(message,'\\[/(IMG|LINK)\\].*')) as corr_message
FROM post
WHERE message LIKE '%https://example.com/images/%'
Hope this fits now for your case :)
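To try the pattern against the sample rows without touching the table, a small Python sketch of the same idea may help. It is not the SQL above verbatim: instead of gluing two REGEXP_SUBSTR results together, it uses a lookahead so that only a bare 4 to 6 digit ID directly followed by [/LINK] or [/IMG] gets the extension. The 7-digit row is a made-up example of a "longer ID".

```python
import re

# A 4-6 digit image ID that is immediately followed by a closing tag.
# The lookahead leaves the tag itself untouched and skips IDs that already
# have an extension or that are longer than 6 digits.
BARE_IMAGE = re.compile(
    r"(https://example\.com/images/[0-9]{4,6})(?=\[/(?:IMG|LINK)\])"
)


def add_extension(message: str) -> str:
    return BARE_IMAGE.sub(r"\1.png", message)


rows = [
    "test",
    "Here is your image link: [LINK]https://example.com/images/1234[/LINK] You can view it now.",
    "[LINK]no correct url[/LINK]",
    "[LINK][IMG]https://example.com/images/9123[/IMG][/LINK]",
    "[LINK]https://example.com/images/1234567[/LINK]",  # hypothetical longer ID
]

for row in rows:
    print(add_extension(row))
```

Running it twice on the same text changes nothing the second time, which is a convenient property for a one-off migration.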

MySQL: Why aren't url's matching when using REPLACE?

My Situation:
I have url's in a field containing blog posts. The url's are being stored in my database with escape characters. My task at the moment is to replace some already inserted 'http' url's with 'https' url's, but REPLACE will match neither the original url nor the escaped url. I can't just replace every instance of 'http:', because I only want to affect certain links in each post, not every link.
I am very familiar with SQL, as well as REPLACE, so I'm not just asking how REPLACE works and how to use it. Another user here has tested my queries in his environment and they work. So, there must be something in my configuration that is preventing the queries from functioning as expected.
I have searched this site and google extensively for several hours and have found nothing specifically addressing my issue. Everything I have tried is included below and if there is something else I should try, I don't know what that is and I haven't found any suggestions/posts/comments that suggest doing anything differently.
Example URL:
http://test01.mysite.com
As Stored in DB:
http:\/\/test01.mysite.com
Code to Re-Create Situation:
DROP TABLE IF EXISTS test_posts;
CREATE TABLE IF NOT EXISTS test_posts (
id int NOT NULL AUTO_INCREMENT,
post_content longtext NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO
test_posts
(post_content)
VALUES
('content content content Link I want to change content content content Link I don\'t want to change content content content Link I want to change content content content Link I don\'t want to change');
If I run
UPDATE
test_posts
SET
post_content = REPLACE(post_content, 'http://test01.mysite.com', 'https://test01.mysite.com');
or
UPDATE
test_posts
SET
post_content = REPLACE(post_content, 'http:\/\/test01.mysite.com', 'https://test01.mysite.com');
zero records are affected.
For testing purposes, I ran the following query which returns 0 rows.
SELECT
*
FROM
test_posts
WHERE
post_content LIKE '%http://test01.mysite.com%'
OR
post_content LIKE '%http:\/\/test01.mysite.com%'
OR
post_content LIKE '%http:\\/\\/test01.mysite.com%'
OR
post_content LIKE 'http:%/%/test01.mysite.com%';
If I run:
SELECT
*
FROM
test_posts
WHERE
post_content LIKE '%http:_/_/test01.mysite.com%'
It does return matches, but that doesn't solve the real problem of how to match when using UPDATE/REPLACE.
I have tried on two different servers and I get the same results on both.
I have tried the following Engine/Collation combinations and all return the same 0 records results:
MyISAM/latin1_swedish_ci
MyISAM/utf8mb4_unicode_ci
InnoDB/latin1_swedish_ci
InnoDB/utf8mb4_unicode_ci
Anybody know how I can write these queries so that REPLACE will find matches to those url's or what settings in my database or PhpMyAdmin may be causing the queries to return/affect 0 rows?
I think the backslash must be escaped in MySQL
field_name LIKE 'http:\\/\\/test01.mysite.com%'
Of course, to be safe one could use the single-character wildcard _:
field_name LIKE 'http:_/_/test01.mysite.com%'
or, to cover both of your cases, an optional backslash:
field_name LIKE 'http:%/%/test01.mysite.com%'
I'm still baffled as to why the queries with LIKE won't work, but, sadly, using those to narrow down the problem clouded my judgement and I didn't try all the same combinations in the REPLACE functions.
The following works:
UPDATE
test_posts
SET
post_content = REPLACE(post_content, 'http:\\/\\/test01.mysite.com', 'https://test01.mysite.com');
If anyone can explain to me why these combinations work with REPLACE, but not with LIKE, I'd really love to know. Thanks!
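A likely explanation is that the text goes through one round of unescaping for REPLACE but two for LIKE: the string parser turns \\ into a single backslash, and LIKE then treats that backslash as its own escape character. The MySQL manual accordingly says to write \\\\ when searching for a backslash with LIKE. The Python below is only a simplified model of those two passes (it ignores escapes such as \n and any sql_mode settings), not MySQL's actual parser.

```python
import re


def mysql_literal(src: str) -> str:
    """Simplified model of how MySQL reads the inside of a quoted literal."""
    out = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch == "\\" and i + 1 < len(src):
            nxt = src[i + 1]
            # \% and \_ keep their backslash; other escapes (\\, \/, \') drop it.
            # Special escapes such as \n or \t are deliberately not modelled.
            out.append("\\" + nxt if nxt in "%_" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def like_to_regex(pattern: str) -> str:
    """Translate a LIKE pattern (backslash as escape character) into a regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped char matches itself
            i += 2
        elif ch == "%":
            out.append(".*")
            i += 1
        elif ch == "_":
            out.append(".")
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def like(value: str, literal_src: str) -> bool:
    regex = like_to_regex(mysql_literal(literal_src))
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


stored = r"see http:\/\/test01.mysite.com here"

# REPLACE: '\\/' in the SQL text is one backslash plus a slash after parsing.
needle = mysql_literal(r"http:\\/\\/test01.mysite.com")
print(needle in stored)                                       # True
# LIKE: the same text is unescaped a second time, so it looks for plain "//".
print(like(stored, r"%http:\\/\\/test01.mysite.com%"))        # False
# Four backslashes survive both passes as one literal backslash.
print(like(stored, r"%http:\\\\/\\\\/test01.mysite.com%"))    # True
```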
There is no reason your query shouldn't work if you ran it properly; there must be something else you are missing here.
UPDATE
test1
SET
name_1 = REPLACE(name_1, 'http:\/\/test01.mysite.com', 'https://test01.mysite.com')
works well and does the job of replacing the \/ with /.
You may have some other problem; please check and update the question if so.
Edit after comments
If you have more data points in URL, change query like below.
UPDATE
test1
SET
name_1 = REPLACE(name_1, '\/', '/')
The above will replace all occurrences of \/ with /.
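As \\ did not work to represent/escape a backslash in LIKE, you can use regular expression functions instead (MySQL 8.0 or later):
REGEXP_LIKE(field, 'http:\\\\/\\\\/test01\\.mysite\\.com')
REGEXP_REPLACE(field, 'http:\\\\/\\\\/', 'http://')
Here a literal backslash has to be written as \\\\, because the string parser and the regular expression engine each consume one level of escaping.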

Need SQL Query Help: How to Search and Replace Specific Text LIKE x AND NOT LIKE xx

Thanks in advance for any help. I'm working on fixing all broken links in a massive WordPress multisite database and need some help writing an SQL query to run via phpMyAdmin. I've searched, but can't find the perfect solution...
PROBLEM: We have more than a thousand broken links that start with http:/ instead of http://
CHALLENGE: The following would result in numerous links starting with http:///
UPDATE wp_1_posts
SET post_content = replace (post_content,
'http:/',
'http://');
PROCESS: I want to write a query to SELECT all these links first, so I can review them to ensure I don't do any damage when replacing the text string. Downloading a db dump and doing a manual S&R is not an option since we're talking about a multi-gigabyte database.
I thought something like this would work...
SELECT * FROM wp_1_posts
WHERE post_content LIKE '%http:/%'
AND WHERE post_content NOT LIKE '%http://%'
But that just throws a syntax error. Am I even close?
QUESTION #1: How can I find all instances of "http:/" without returning all "http://" instances in the query results.
QUESTION #2: How might I safely fix all instances of "http:/" without affecting any "http://" strings.
FYI: I'll admit I know just enough about this to be dangerous, and I am not familiar with regular expressions. at. all. That's why I'm turning to you for help. Thanks again!
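This should work in MySQL. The REGEXP condition also answers question #1 when used in a SELECT. Because REPLACE rewrites every occurrence inside a matching row, the inner call turns already-correct links into http:///, which the outer call collapses again:
UPDATE wp_1_posts SET post_content = replace(replace(post_content, 'http:/', 'http://'), 'http:///', 'http://')
WHERE post_content REGEXP 'http:/[^/]'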
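If you want to review the effect on a few exported posts before running anything on the server, the rule "http:/ not followed by another slash" can be sketched in Python with a negative lookahead. The sample text is made up, and this is only a way to preview the substitution, not a replacement for the SQL.

```python
import re

# "http:/" that is NOT followed by a second slash.
BROKEN_SCHEME = re.compile(r"http:/(?!/)")


def find_broken(post_content: str) -> list:
    """Return the character offsets of every broken link prefix."""
    return [m.start() for m in BROKEN_SCHEME.finditer(post_content)]


def fix_links(post_content: str) -> str:
    """Add the missing slash without touching correct http:// links."""
    return BROKEN_SCHEME.sub("http://", post_content)


sample = "bad: http:/example.com good: http://example.org"
print(find_broken(sample))
print(fix_links(sample))
```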

MySQL Query to retrieve full URL slug

First question!
I have a MySQL table which stores all content on the various pages of a site. Let's say there are three fields: id (int), permalink (varchar) and parent (int). parent is the id of its parent page.
I need a query that will build a full URL of a page, I'm guessing using CONCAT. I have it working fine for two levels, but can't figure out a way to make it scale for multiple levels; /root/level1/level2/ etc.
Here's what I have so far.
SELECT
CONCAT(
(SELECT permalink FROM content WHERE id = 2 LIMIT 1), # id = parent
"/",
(SELECT permalink FROM content WHERE id = 11 LIMIT 1)) as full_url
Any help, greatly appreciated!
That would be a recursive query; you have to use a stored procedure on the server (which are available in MySQL, @Claude).
You cannot do recursion in a plain query in older MySQL versions; you would have to use a stored procedure there. MySQL 8.0 and later also support recursive common table expressions (WITH RECURSIVE).
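Whichever server-side route you take, the logic is a walk up the parent chain until there is no parent left. Here is a minimal sketch of that walk in application code, in Python. The in-memory pages dictionary stands in for the content table, the sample rows are invented, and the assumption that a top-level page has a NULL (None) parent is mine.

```python
from typing import Dict, Optional, Tuple

# id -> (permalink, parent id); None marks a top-level page (an assumption).
Pages = Dict[int, Tuple[str, Optional[int]]]


def full_url(pages: Pages, page_id: int) -> str:
    """Build /root/level1/level2/ by following parent ids upwards."""
    segments = []
    seen = set()
    current: Optional[int] = page_id
    while current is not None and current in pages:
        if current in seen:
            # Guard against a corrupt table where a page is its own ancestor.
            raise ValueError("cycle in parent chain")
        seen.add(current)
        permalink, parent = pages[current]
        segments.append(permalink)
        current = parent
    return "/" + "/".join(reversed(segments)) + "/"


# Hypothetical rows standing in for the content table.
pages: Pages = {
    2: ("root", None),
    11: ("level1", 2),
    15: ("level2", 11),
}

print(full_url(pages, 15))
```

This costs one lookup per level, which is fine when the rows are already loaded; against a live database it would mean one query per level, which is the trade-off compared with doing the recursion on the server.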