Escape HTML as well as \r \t and \n from a string - html

I am trying to build a Solr search index from a string that is assembled in code and contains HTML tags. Does anyone know how I can remove all of these characters from the String?
Currently, I am using
answers << answer.feedback.replaceAll('\\<.*?>','')
I want to escape all the HTML characters and all the \n, \t and \r. How can I do this?

Do you want to escape the HTML tags, so that <span> becomes &lt;span&gt;, or do you want to REMOVE the tags themselves? Your original question is ambiguous.
For the first scenario:
answer.feedback.encodeAsHTML()
(see http://grails.org/doc/latest/ref/Plug-ins/codecs.html for further info)
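For comparison, here is a minimal Python sketch of both readings of the question (the original code is Groovy, so this is an illustration and not the asker's code). The helper names strip_tags_and_controls and escape_for_display are made up for this example. The tag-stripping regex is a rough approximation and not a real HTML parser, so it can misbehave on comments, scripts, or attribute values that contain ">".

```python
import html
import re


def strip_tags_and_controls(text: str) -> str:
    """Second scenario: remove tag-like spans, then turn runs of \\r, \\n, \\t into one space."""
    without_tags = re.sub(r"<[^>]*>", "", text)
    return re.sub(r"[\r\n\t]+", " ", without_tags).strip()


def escape_for_display(text: str) -> str:
    """First scenario: keep the tags but escape them so they show up as text."""
    return html.escape(text)


sample = "<p>Good\tjob,\r\n<b>well done</b></p>\n"
print(strip_tags_and_controls(sample))  # Good job, well done
print(escape_for_display("<span>"))     # &lt;span&gt;
```

If the result feeds a search index, the stripping variant is probably the one you want, since escaped entities would otherwise be indexed as words.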

Related

How to search and delete a series of attributes in HTML using Vim?

I need help with Vim and regex.
I have an HTML file with a lot of class="..." attributes,
e.g.
<td class="td_3"><p class="block_3">80 €</p></td>
<td class="td_3"><p class="block_3">90 €</p></td>
Since I'm not using any CSS, I want to delete them.
I tried:
:%s/class="[a-z0-9]"//g
but it's not working.
What am I doing wrong?
With the class="[a-z0-9]" pattern, you match a single alphanumeric char between the quotes, whereas the attribute value may contain any text other than a double quotation mark.
You probably also want to remove the whitespace before the class attribute.
You may use
:%s/\s\+class="[^"]*"//g
Here, \s\+ matches one or more whitespace chars, class=" matches a literal string, [^"]* matches zero or more chars other than " (as many as possible), and the final " matches the closing double quotation mark.
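The same pattern can be tried outside Vim. This is a small Python sketch, not part of the original answer; note that Python writes the quantifier as \s+ where Vim's default syntax needs \s\+. The function name drop_class_attributes is made up for this example.

```python
import re

# Whitespace, then class="...", where the value is any run of non-quote characters.
CLASS_ATTR = re.compile(r'\s+class="[^"]*"')


def drop_class_attributes(markup: str) -> str:
    """Remove every class="..." attribute together with the whitespace in front of it."""
    return CLASS_ATTR.sub("", markup)


row = '<td class="td_3"><p class="block_3">80 €</p></td>'

# The original attempt only allows ONE character between the quotes, so it finds nothing.
print(re.search(r'class="[a-z0-9]"', row))  # None

print(drop_class_attributes(row))  # <td><p>80 €</p></td>
```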

How do I type html in a markdown file without it rendering?

I want to type the following sentence in a markdown file: she says <h1> is large. I can do it in StackOverflow with three backticks around h1, but this doesn't work for a .md file. I've also tried a single backtick, single quote, double quote, hashtags, spacing, <code>h1</code> and everything else I could think of. Is there a way to do this?
You can escape the < characters by replacing them with &lt;, which is the HTML escape sequence for <. Your sentence would then be:
she says &lt;h1> is large
As a side note, the original Markdown "spec" has the following to say:
However, inside Markdown code spans and blocks, angle brackets and ampersands are always encoded automatically. This makes it easy to use Markdown to write about HTML code. (As opposed to raw HTML, which is a terrible format for writing about HTML syntax, because every single < and & in your example code needs to be escaped.)
...which means that, if you're still getting tags when putting them in backticks, whatever renderer you're using isn't "compliant" (to the extent that one can be compliant with that document), and you might want to file a bug.
Generally, you can surround the code in single backticks to automatically escape the characters. Otherwise just use the HTML escapes: &lt; for < and &gt; for >.
i.e.
she says &lt;h1&gt; is large or she says `<h1>` is large
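If the Markdown text is generated by a script, the escaping can be automated. A minimal Python sketch using the standard library's html.escape (the variable names are only for illustration):

```python
import html

sentence = "she says <h1> is large"

# html.escape replaces &, < and > (and quotes, by default) with their entities.
escaped = html.escape(sentence)
print(escaped)  # she says &lt;h1&gt; is large
```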
A backslash (\) can be used to escape < and >.
Ex: she says \<h1> is large

How to find words enclosed in regex?

I have the following text on my HTML pages and I need to replace some words. For example:
"./images/delete.html"
"http.google.com"
this is a text
I have to retrieve the string with the following criteria:
words enclosed in a ""
words with ./images/ in any position (or after opening ")
words ending with .html
In my example, only the "./images/delete.html" should be returned.
Can someone please help me? Thanks!
You could try the regex below, which uses the negated character class [^"]* to match zero or more characters other than ".
"[^"]*\.\/images\/[^"]*\.html"
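To check the pattern against the three sample lines from the question, here is a short Python sketch. It is only an illustration: the question does not say which language or tool is in use, and the slashes need no escaping in Python, so \/ is written as /.

```python
import re

# Opening quote, any non-quote chars, "./images/", any non-quote chars, ".html", closing quote.
PATTERN = re.compile(r'"[^"]*\./images/[^"]*\.html"')

text = '"./images/delete.html"\n"http.google.com"\nthis is a text\n'

matches = PATTERN.findall(text)
print(matches)  # ['"./images/delete.html"']
```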

regex newline character error

I am trying to make my regex work across multiple lines, and the "m" flag didn't seem to help. My regex works for the first line but not for the following lines.
You can skip the match part and just do it all in one step:
> "the *text* is to be replaced \n by *text*".replace(/\*([\s\S]*?)\*/g, '<i>$1</i>');
"the <i>text</i> is to be replaced \n by <i>text</i>"
. matches any character, but it excludes newlines. [\s\S] matches any character including newlines.
I changed your search regex to \*([\s\S]*?)\*, which non-greedily matches the stuff between the asterisks.
The replacement string is <i>$1</i>. $1 is replaced with the contents of the first capturing group, which is your text.
Also, because it looks like you're trying to convert Markdown to HTML, try using a pre-made JS converter: http://www.showdown.im/
You can use it like this:
var str = "the *text* is to be *replaced \n by* *text*";
alert(str.replace(/\*([\s\S]*?)\*/g, '<i>$1</i>'));
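The same idea carries over to other regex engines. As a hedged illustration in Python (not the asker's language), re.sub with [\s\S]*? pairs the asterisks across the newline, while a plain . does not, because . skips newlines unless the DOTALL flag is set:

```python
import re

text = "the *text* is to be *replaced \n by* *text*"

# [\s\S] matches any character, newline included; *? keeps each match as short as possible.
converted = re.sub(r"\*([\s\S]*?)\*", r"<i>\1</i>", text)
print(converted)

# With a plain dot, the pair of asterisks that spans the newline is not matched as a unit.
dot_only = re.sub(r"\*(.*?)\*", r"<i>\1</i>", text)
print(dot_only)
```

In Python, passing flags=re.DOTALL to the second call would make . behave like [\s\S].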

Rendering newlines in escaped html

We have a display message that is auto-generated by default, but may be overridden by GET vars from the url. Since we need to treat the message as user input, it must be escaped for display. However, we want to retain the ability to include newlines.
Newlines as <br>
This won't work because escaping HTML destroys the <br> tag.
Newlines as \n
I can't figure out how to get \n to render as newlines. I thought putting it in a tag would render correctly, but no luck: http://jsfiddle.net/7L932/
Escape the HTML, and then replace \n with <br>.
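The order matters: escaping first neutralises any markup in the user input, and the <br> tags added afterwards are the only real tags left. A minimal Python sketch of that order of operations (the question does not name its server-side language, and render_message is a made-up name):

```python
import html


def render_message(user_text: str) -> str:
    """Escape untrusted text first, then turn newlines into <br> tags."""
    return html.escape(user_text).replace("\n", "<br>")


print(render_message("Hello <b>world</b>\nsecond line"))
# Hello &lt;b&gt;world&lt;/b&gt;<br>second line
```

Doing it the other way round would escape the <br> tags too, which is the problem described in the question.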
In case you want to use \n, I fixed your fiddle for you: http://jsfiddle.net/hr3bg/
Setting the wrapper element's CSS to white-space: pre-line did the trick for me. It makes newline characters in the text render as line breaks.
What you're doing is more or less fine, except that you should put an actual newline character in your HTML, not the \n escape sequence (and what Prinzhorn says also makes perfect sense, so I'll go upvote him).
Your theory's sound, but \n is not an HTML-recognised way of inserting a new line. It either comes in explicitly as a literal return in the markup (as I've inserted in a new .linebreaks element), or, if you're using some intermediary scripting language that does recognise \n (like JS), do that (as I've done to your first .linebreaks with the jQuery code I inserted).
See my tweak to your example: http://jsfiddle.net/barney/7L932/2/