I have some html which contains a number of hyperlinks to html files, but they don't have any file extensions.
For example, in the string <a href='variablelengthfilename'> I'm trying to match the trailing ' so I can replace it with .html' (using a regex search in Notepad++), with something like this:
`(?<=href='[A-Za-z]*)'`
but that won't work because Notepad++ doesn't allow variable-length lookbehind assertions.
How else can I achieve this?
Thanks
Since you are working in Notepad++, here is a way to achieve what you are after:
Find what: \bhref='[^']*
Replace with: $&.html
The \bhref='[^']* regex matches href as a whole word, then =' literally, and then [^']* matches zero or more characters other than '. In the replacement, $& stands for the whole match, so .html is appended just before the closing quote. Note that you will need to replace ' with " if the href value is inside double quotes.
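If you want to try the same idea outside Notepad++, here is a minimal Python sketch. It assumes single-quoted href values as in the question; the function name add_html_extension and the sample string are made up for illustration. Python spells the "whole match" reference as \g<0> rather than $&.

```python
import re

def add_html_extension(html: str) -> str:
    # Match href=' plus everything up to (not including) the closing quote,
    # then re-emit the whole match followed by ".html".
    return re.sub(r"\bhref='[^']*", r"\g<0>.html", html)

sample = "<a href='variablelengthfilename'>link</a>"
result = add_html_extension(sample)
print(result)
```

Because the closing quote is never part of the match, no lookbehind is needed at all.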
Assuming all your links look like that, why not just do a simple replace
'>
with
.html'>
?
I'd like to use Regex to match HTML tag "head" and text inside them so I can delete them easily. I'm using a find and replace tool that is utilizing regex syntax and it really works great in replacing multiple files at once.
I tried doing a lot of syntax but I always fail.
http://regex101.com/r/aZ6pN5/2
Anyone can help please?
Replace .* in your regex with [\s\S]*? so that it also matches line breaks. JavaScript engines older than ES2018 don't support the s (DOTALL) modifier.
<head.*?>([\s\S]*?)<\/head>
[\s\S]*? does a non-greedy match of zero or more characters of any kind (whitespace or non-whitespace), so it also spans line breaks.
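As a rough illustration of the pattern above, here is a small Python sketch that deletes the whole head block. The helper name strip_head and the sample page are assumptions for the example, not part of the original tool.

```python
import re

# [\s\S]*? crosses line breaks without needing a DOTALL flag.
HEAD_BLOCK = re.compile(r"<head.*?>([\s\S]*?)</head>")

def strip_head(html: str) -> str:
    # Replace the entire <head>...</head> block with nothing.
    return HEAD_BLOCK.sub("", html)

page = "<html>\n<head>\n<title>t</title>\n</head>\n<body>hi</body>\n</html>"
stripped = strip_head(page)
print(stripped)
```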
OR
To replace the contents of head tag.
(<head\b[^<>]*>)[\s\S]*?(<\/head>)
Replacement string:
$1stringyouwant$2
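The same find/replace pair can be sketched in Python as follows. The function name replace_head_contents and the sample input are hypothetical; a function is used as the replacement so that the new contents are inserted literally rather than being parsed for group references.

```python
import re

# Group 1 is the opening <head ...> tag, group 2 the closing </head> tag.
HEAD_CONTENTS = re.compile(r"(<head\b[^<>]*>)[\s\S]*?(</head>)")

def replace_head_contents(html: str, new_contents: str) -> str:
    # Keep both tags and swap only what sits between them.
    return HEAD_CONTENTS.sub(
        lambda m: m.group(1) + new_contents + m.group(2), html
    )

page = "<head lang='en'>\n<title>old</title>\n</head><body></body>"
replaced = replace_head_contents(page, "<title>new</title>")
print(replaced)
```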
I'm using Sublime Text and I need to come up with a regex that will find the whitespaces between a certain opening and closing tag and replace them with commas.
Example: Replace white space in
<tags>This is an example</tags>
so it becomes
<tags>This,is,an,example</tags>
Thanks!
You just have to use a simple regex like:
\s+
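And replace it with a comma. Note that this replaces every run of whitespace in the text being searched, not only whitespace between the tags, so restrict the search to a selection if needed.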
This will find instances of
<tags>...</tags>
with whitespace between the tags
(<tags>\S+)\s(.+</tags>)
This will replace the first whitespace with a comma
\1,\2
Open Find and Replace [OS X Cmd+Opt+F :: Windows Ctrl+H]
Use the two values above to find and replace and use the 'Replace All' option. Repeat until all the whitespaces are converted to commas.
The best answer is probably a quick script but this will get you there fairly fast without needing to do any coding.
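For anyone who does want the quick script, the "repeat until nothing changes" procedure could look roughly like this in Python. This is a sketch under assumptions: it uses \s for the single whitespace character, the function name commas_by_repetition is made up, and it is only meant for input holding a single <tags>...</tags> element per line.

```python
import re

# One pass replaces only the first whitespace character after <tags>.
FIRST_GAP = re.compile(r"(<tags>\S+)\s(.+</tags>)")

def commas_by_repetition(text: str):
    passes = 0
    while True:
        text, count = FIRST_GAP.subn(r"\1,\2", text)
        if count == 0:
            # Nothing left to replace, so stop repeating.
            return text, passes
        passes += 1

converted, passes_needed = commas_by_repetition("<tags>This is an example</tags>")
print(converted, passes_needed)
```

Each pass plays the role of one click on 'Replace All'; the loop ends when a pass makes no replacement.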
You can replace any one or more whitespace chunks in between two tags using a single regular expression:
(?s)(?:\G(?!\A)|<tags>(?=.*?</tags>))(?:(?!</?tags>).)*?\K\s+
Details:
(?s) - a DOTALL inline modifier, makes . match line breaks
(?:\G(?!\A)|<tags>(?=.*?</tags>)) - either the end of the previous successful match (\G(?!\A)) or (|) <tags> substring that is immediately followed with any zero or more chars, as few as possible and then </tags> (see (?=.*?</tags>))
(?:(?!</?tags>).)*? - any char that does not start a <tags> or </tags> substring, zero or more occurrences but as few as possible
\K - match reset operator
\s+ - one or more whitespaces (NOTE: use \s if each whitespace must be replaced).
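Not every regex engine supports \G and \K (Python's built-in re module, for example, does not). As a different technique that gets the same result, here is a hedged Python sketch that first captures each <tags>...</tags> block and then replaces the whitespace runs inside it with a callback. The names TAGS_BLOCK and commas_inside_tags are illustrative only.

```python
import re

# DOTALL lets . match line breaks, like the (?s) modifier above.
TAGS_BLOCK = re.compile(r"(<tags>)(.*?)(</tags>)", re.DOTALL)

def commas_inside_tags(text: str) -> str:
    def fix(match: re.Match) -> str:
        # Only the text between the tags has its whitespace runs replaced.
        inner = re.sub(r"\s+", ",", match.group(2))
        return match.group(1) + inner + match.group(3)
    return TAGS_BLOCK.sub(fix, text)

text = "keep this <tags>This is  an\nexample</tags> and this"
converted = commas_inside_tags(text)
print(converted)
```

Whitespace outside the tags is left untouched because it is never part of a match.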
Just wanted to know if this is the right way to write a regular expression for an opening HTML tag <strong>: /<strong[^>]*/i?
What I am trying to do is have a pattern in place for HTML tags and then use it to match any HTML document.
Thanks in advance!
Close.
It would be like this for the opening tag:
/<strong[^>]*?>/i
Keep in mind that using Regex on HTML which involves tags nested within themselves can get very messy.
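To see what the pattern does and does not match, here is a small Python sketch. The second pattern, with a \b word boundary, is my own addition and not part of the answer above; it is shown only because the original pattern also accepts tag names that merely start with "strong". The sample HTML is invented.

```python
import re

STRONG_OPEN = re.compile(r"<strong[^>]*?>", re.IGNORECASE)
# Assumed variant: \b stops the match at the end of the tag name.
STRONG_OPEN_STRICT = re.compile(r"<strong\b[^>]*>", re.IGNORECASE)

html = '<STRONG class="x">bold</strong> <stronger>nope</stronger>'
loose = STRONG_OPEN.findall(html)
strict = STRONG_OPEN_STRICT.findall(html)
print(loose)
print(strict)
```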
OK. As I understand it, you want to match any string between the "<" and ">" symbols, for example <codekaro>
To do so you can use :
^[\<][A-Za-z]*[\>]$
Here, ^ indicates start of an expression,
[\<] will check for one occurrence of < symbol, \ is used as escape character for < symbol
[A-Za-z]* will check for any string,
[\>] will check for one occurrence of > symbol, \ is used as escape character for > symbol
$ indicates end of an expression.
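A quick way to check this pattern is a few test strings in Python. This is only a sketch; the helper name is_simple_tag is hypothetical. Note that because of the *, an empty <> also matches, and tags with attributes or digits do not.

```python
import re

SIMPLE_TAG = re.compile(r"^[\<][A-Za-z]*[\>]$")

def is_simple_tag(s: str) -> bool:
    # True only when the whole string is "<", letters, then ">".
    return SIMPLE_TAG.search(s) is not None

checks = {
    "<codekaro>": is_simple_tag("<codekaro>"),
    "<a href='x'>": is_simple_tag("<a href='x'>"),
    "<>": is_simple_tag("<>"),
    "text <b>": is_simple_tag("text <b>"),
}
print(checks)
```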
I encourage you to work through a regex tutorial and to check the results of your regular expressions in an online regex tester.
Hope this will help you..!!
Happy learning..!!
I have a ton of text replacements to make and I would like to try and do this all at once instead of manually. I'm trying to replace <a class='stuff morestuff' href='#'>Some Text</a> with Some Text; essentially stripping off the surrounding anchor tag.
I've been messing around with a search/replace in Visual Studio using regex, but am not really getting anywhere. My latest attempt:
Find what:
\<a class='stuff morestuff' href='#'\>(.+)\<\/a\>
Replace with:
$1
If what I want to do is even feasible, how can I correct my regex to accomplish this?
This regex will match your anchors if the class and href are always the same:
Find: \<a[^\>]class='stuff morestuff' href='\#'[^\>]*\>(.*)\</a\>
Replace: $1
This regex will replace all the anchors with the inner text:
Find: \<a[^\>]*\>(.*)\</a\>
Replace: $1
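One thing to watch with (.*) is greediness: if two anchors sit on the same line, a greedy group runs from the first opening tag to the last closing tag. The Python sketch below contrasts the greedy form with a lazy (.*?) form; it uses Python's regex syntax (no escaping of < and >, and \1 instead of $1) rather than Visual Studio's, and the sample text is made up.

```python
import re

html = "<a class='stuff morestuff' href='#'>Some Text</a> and <a href='#'>More</a>"

# Greedy: one match spanning both anchors, so inner tags survive.
greedy = re.sub(r"<a\b[^>]*>(.*)</a>", r"\1", html)

# Lazy: each anchor is matched separately and reduced to its inner text.
lazy = re.sub(r"<a\b[^>]*>(.*?)</a>", r"\1", html)

print(greedy)
print(lazy)
```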
I'm assuming from your post you plan to use this in Visual Studio's Find/Replace and not in code.
Find: \<a class='.*?' href='#'>(.*?)\</a\>
Replace: $1
I know you can search text in HTML using wildcards. Can you also search for attribute values in HTML using wildcards with Nokogiri?
E.g., suppose I want to search for classes with a value matching *session*.
You can use xpath contains() function to search the document. Something like:
doc.xpath("//*[@*[contains(., 'session')]]").each do |ele|
# something
end
This search returns all the elements with any attribute whose value contains the string 'session'.
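To make it concrete what that XPath selects, here is a plain-Python sketch of the same filter using the standard library's xml.etree.ElementTree instead of Nokogiri. That is a swapped-in technique: ElementTree needs well-formed markup and does not support contains() in its XPath subset, so the check is written as a loop. The function name and the sample document are invented.

```python
import xml.etree.ElementTree as ET

def elements_with_attr_containing(root: ET.Element, needle: str):
    # Keep every element that has at least one attribute value containing needle.
    return [
        el for el in root.iter()
        if any(needle in value for value in el.attrib.values())
    ]

doc = ET.fromstring(
    "<div>"
    "<p class='user-session-box'>a</p>"
    "<span id='session1'>b</span>"
    "<p class='other'>c</p>"
    "</div>"
)
tags = [el.tag for el in elements_with_attr_containing(doc, "session")]
print(tags)
```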
I had a similar problem a few days ago. Note the spaces around the class values, which make the test match a whole class name rather than a substring:
find(:xpath, "//*[contains(concat(' ', normalize-space(@class), ' '), ' icon-edit ')]")
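The padding trick in that expression is easier to see in ordinary code. The following Python sketch mirrors it, assuming ASCII whitespace in the class attribute; has_class is a hypothetical helper, not a library function.

```python
def has_class(class_attr: str, token: str) -> bool:
    # Collapse whitespace runs and trim (like normalize-space), then pad
    # with spaces so the token can only match as a whole class name.
    normalized = " " + " ".join(class_attr.split()) + " "
    return f" {token} " in normalized

whole_word = has_class("btn   icon-edit", "icon-edit")
partial = has_class("icon-edit-large", "icon-edit")
naive_substring = "icon-edit" in "icon-edit-large"
print(whole_word, partial, naive_substring)
```

A plain contains() on the class attribute behaves like the naive substring test, which is why the padded form is needed.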