how to enter < br > from database - html

I'm inserting text from a database into my HTML. Usually when I enter <br> it works fine, but for some reason I'm getting &lt;br&gt; when I try to do this in a section that uses FullCalendar. I also tried \n, but that didn't work either. Is there a way to store <br> in the database so that it renders as a line break?

In FullCalendar 5.5.1 you can use this:
eventContent: function(eventInfo) {
  return { html: eventInfo.event.title }
}
If your title contains a <br>, it will be rendered as a line break.

Databases usually (not sure which DB you are using) do not automatically convert or escape special characters.
So I guess somewhere in your code you are encoding the special characters before inserting them into the DB. So now you have to decode those after retrieving them from the DB.
If you are using Java, you can use the escapeHtml4 and unescapeHtml4 functions of StringEscapeUtils to encode and decode.
If you are using PHP, you can use htmlspecialchars and htmlspecialchars_decode to do the same.

For FullCalendar, you will have to use the eventRender callback to do so. Try this hack:
eventRender: function (event, element) {
  element.find('.fc-title').html(event.title);
}

Related

how to decode UTF-8 to HTML tags

I have an HTML document saved in my database as follow:
\\u003cp style=\\\"text-align: center; opacity: 1;\\\"\\u003e\\u003cstrong\\u003e\\u003cspan style=\\\"font-size: 18pt;\\\
I know, it is ugly and I know, it is not the desired way but this is a legacy system.
My task is to take all of these HTML strings and convert each one to a document in Google Docs. Google Docs can parse HTML into its internal format pretty well, but the input needs to be valid HTML, with <p> instead of \\u003cp.
I'm trying to convert/decode/parse/whatever this string to a valid HTML but so far, without any luck.
Things I already tried: the htmlentities gem, CGI decode, Nokogiri::HTML.parse, and JSON.parse. None of them did the job.
I also tried string.encode(xxxx), but also without luck. I was really hoping that the .encode method would do it, but I couldn't make it work. Maybe I'm using the wrong encoding? (I tried all of the ISO-xxx encodings.)
Here's a quick workaround for you:
input_string.gsub(/\\u(\h{4})/) { [$1.to_i(16)].pack('U') }
With the example input you gave above, this results in:
"<p style=\\\"text-align: center; opacity: 1;\\\"><strong><span style=\\\"font-size: 18pt;\\"
Explanation:
\u003c == <. The left hand side is an escaped unicode character; this is not the same thing as \\u003c, which is a literal backslash followed by u003c.
The regular expression \\u(\h{4}) will match any occurrences of this (\h stands for "hexadecimal" and is equivalent to [0-9a-fA-F]). $1.to_i(16) turns the four captured hex digits into an integer code point, and Array#pack('U') converts that code point into (in this case) a UTF-8 character.
Ideally of course, you'd solve the problem at its root rather than retro-fit a workaround like this. But if that's outside of your control, then a workaround will have to suffice.
Using Array#pack:
string = "\\u003cp style=\\\"text-align: center; opacity: 1;\\\"\\u003e\\u003cstrong\\u003e\\u003cspan style=\\\"font-size: 18pt;\\"
string.gsub(/\\u(....)/) { [$1.hex].pack("U") }
# => "<p style=\\\"text-align: center; opacity: 1;\\\"><strong><span style=\\\"font-size: 18pt;\\"
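For readers not working in Ruby, the same substitution can be sketched in JavaScript. This is only an illustration of the technique in the answers above: the function name decodeUnicodeEscapes is made up for this example, and it assumes every escape is a literal backslash, a "u", and exactly four hex digits.

```javascript
// Hypothetical helper (not from the original answers): replaces each literal
// "\uXXXX" sequence in the text with the character it names.
function decodeUnicodeEscapes(input) {
  return input.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
    // parseInt(hex, 16) plays the role of $1.to_i(16) / $1.hex in the Ruby versions.
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// The JavaScript literal below contains real backslashes, like the stored data.
console.log(decodeUnicodeEscapes('\\u003cp\\u003eHi\\u003c/p\\u003e'));
```

As with the Ruby one-liners, other backslashes in the data (such as the escaped quotes) are left untouched and would need separate handling.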

Groovy can not encodeAsHTML()

I'm using Grails 2.4.3. I enter text containing new lines into a textarea and save it to the database. When I load it in the view, I replace each new line with a line break. However, the page shows a literal <br> instead of a break.
Below is the code I used:
${book.introduction.encodeAsHtml().replace(/\r\n|\r|\n/g,"<br />")}
As long as you trust the output, there are multiple ways of doing this:
As another answer suggests:
${raw(book.introduction.encodeAsHtml().replace(/\r\n|\r|\n/g,"<br />"))}
With the encodeAsRaw() method:
${book.introduction.encodeAsHtml().replace(/\r\n|\r|\n/g,"<br />").encodeAsRaw()}
Using Grails' taglibs:
<g:encodeAs code="Raw">${book.introduction.encodeAsHtml().replace(/\r\n|\r|\n/g,"<br />")}</g:encodeAs>
<g:encodeAs code="None">${book.introduction.encodeAsHtml().replace(/\r\n|\r|\n/g,"<br />")}</g:encodeAs>
The new XSS prevention by default encodes all ${} strings as HTML, so your end product is getting encoded.
You can wrap the whole output in raw to avoid this:
${raw( book.introduction.encodeAsHtml().replace(/\r\n|\r|\n/g,"<br />") )}
See http://grails.org/doc/2.4.3/guide/single.html#xssPrevention for more details. It's worth thinking about what you're rendering and whether XSS could be an issue for you in that place.
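The order of operations in the snippets above (escape the text first, then insert the break tags, then output the result without further encoding) is independent of Grails. Here is a rough JavaScript sketch of the first two steps; the names HTML_ESCAPES and textToHtmlWithBreaks are invented for this example and are not part of any framework.

```javascript
// Hypothetical lookup table for the characters that are unsafe in HTML text.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Hypothetical helper: escapes user text, then turns line endings into <br />.
function textToHtmlWithBreaks(text) {
  // Step 1: escape first, so markup typed by the user stays inert.
  const escaped = text.replace(/[&<>"']/g, function (ch) { return HTML_ESCAPES[ch]; });
  // Step 2: only now add the <br /> tags, so they are not escaped themselves.
  return escaped.replace(/\r\n|\r|\n/g, '<br />');
}

console.log(textToHtmlWithBreaks('line one <b>\r\nline two'));
```

The result must then be written to the page as raw HTML, which is what raw() and encodeAsRaw() do in the answers above. Escaping again at that point brings back the visible <br> text.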

Trouble with html encoding in Google Apps Script

I need to convert HTML entities to their Unicode characters. For example, when I have &amp;, I would like just &. Is there a special function for this, or do I have to call replace() for each pair of HTML entity <--> Unicode character?
Thanks in advance.
Even though there's no DOM in Apps Script, you can parse out HTML and get the plain text this way:
function getTextFromHtml(html) {
  return getTextFromNode(Xml.parse(html, true).getElement());
}
function getTextFromNode(x) {
  switch(x.toString()) {
    case 'XmlText': return x.toXmlString();
    case 'XmlElement': return x.getNodes().map(getTextFromNode).join('');
    default: return '';
  }
}
calling
getTextFromHtml("hello <div>foo</div>&amp; world <br /><div>bar</div>!");
will return
"hello foo& world bar!".
To explain, Xml.parse with the second param as "true" parses the document as an HTML page. We then walk the document (which will be patched up with missing HTML and BODY elements, etc. and turned into a valid XHTML page), turning text nodes into text and expanding all other nodes.
In JavaScript (I assume that's what you're using), there's no built-in function, but you can assign the content to an HTML element and then read the text out. Here's an example with jQuery:
function htmlDecode(value){
  return $('<div/>').html(value).text();
}
Note that the element does not need to actually be attached to the DOM. This just creates a new element, reads out its contents, and then throws it away. You can accomplish something very similar in vanilla JavaScript with just a few extra lines.
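Where no DOM is available at all (as the first answer notes for Apps Script), a replace-based decoder is another option. The sketch below is an assumption-laden illustration, not a complete solution: decodeEntities and NAMED_ENTITIES are names invented here, and only numeric entities plus a small hand-picked set of named ones are handled.

```javascript
// Hypothetical table: extend it with whatever named entities your data contains.
const NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00a0' };

// Hypothetical helper: decodes &name;, &#123; and &#x7b; style entities in one pass.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-fA-F]+|#\d+|[a-zA-Z]+);/g, function (match, body) {
    if (body[0] === '#') {
      const codePoint = body[1] === 'x'
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      // Leave out-of-range numeric references untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are returned unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities('Tom &amp; Jerry &lt;b&gt; &#65;&#x42;'));
```

Because everything happens in a single pass, doubly encoded input such as &amp;lt; is decoded only one level, to &lt;.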

Encode HTML before POST

I have the following script, which encodes some of the value it receives properly, but it does not seem to encode double quotes.
How do I encode the full value properly before posting?
function htmlEncode(value){
  return $('<div/>').text(value).html();
}
The above script gives me this:
&lt;p&gt;Test&amp;nbsp; &lt;span style="color: #ffffff"&gt;&lt;strong&gt;&lt;span style="background-color: #ff0000"&gt;1+1+1=3&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
I need it to give me this:
&lt;p&gt;Test&amp;nbsp; &lt;span style=&quot;color: #ffffff&quot;&gt;&lt;strong&gt;&lt;span style=&quot;background-color: #ff0000&quot;&gt;1+1+1=3&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
EDIT: Followup question:
Encoded HTML in database back to page
You shouldn't try to encode things with JavaScript.
You should encode it serverside.
Anything that can be done with JavaScript can be undone.
It is valid to encode it in JavaScript if you also check that it was encoded on the server, but keep in mind: JavaScript can be disabled.
What George says is true.
But, if you have to encode strings client-side, I'd suggest you use JavaScript's encodeURIComponent().
I had a similar problem. I simply used the replace method in javascript. Here's a nice article to read: http://www.w3schools.com/jsref/jsref_replace.asp
Basically, the replace method swaps each character it finds with the replacement character(s) you specify.
So this:
var str=' " That " ';
str = str.replace(/"/g,'&quot;');
Once you log this into the console of your browser, you will get something like
&quot; That &quot;
And this:
var str=' " That " ';
str = str.replace(/"/g,'blahblahblah');
Once you log this into the console of your browser, you will get something like
blahblahblah That blahblahblah
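Extending the replace idea to the characters the question cares about gives a small jQuery-free encoder. This is a sketch only: escapeHtml is a name chosen for this example, and the set of five characters is a common choice rather than something taken from the original posts. As the earlier answers point out, client-side encoding does not remove the need to validate and encode on the server.

```javascript
// Hypothetical helper: HTML-encodes a value, including both kinds of quotes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')   // must run first so later entities are not re-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml('<span style="color: #ffffff">1+1+1=3</span>'));
```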
You can use this module in js, without requiring jQuery:
htmlencode
You can re-use functions from php.js project - htmlentities and get_html_translation_table
Use escape(str) at client side
and
HttpUtility.UrlDecode(str, System.Text.Encoding.Default); at server side
it worked for me.

Regex to Parse Hyperlinks and Descriptions

C#: What is a good Regex to parse hyperlinks and their description?
Please consider case insensitivity, white-space and use of single quotes (instead of double quotes) around the HREF tag.
Please also consider obtaining hyperlinks which have other tags within the <a> tags such as <b> and <i>.
As long as there are no nested tags (and no line breaks), the following variant works well:
<a\s+href=(?:"([^"]+)"|'([^']+)').*?>(.*?)</a>
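To see that simple variant in action outside C#, here is a minimal JavaScript sketch. LINK_PATTERN and extractLinks are names made up for this example. The "i" flag covers the case-insensitivity requirement, and the two alternatives in the pattern cover double and single quotes. It inherits the stated limits of the pattern: href must be the first attribute, and line breaks inside a link are not matched.

```javascript
// The pattern from above, with "g" to find every link and "i" to ignore case.
const LINK_PATTERN = /<a\s+href=(?:"([^"]+)"|'([^']+)').*?>(.*?)<\/a>/gi;

// Hypothetical helper: returns one { url, text } object per matched hyperlink.
function extractLinks(html) {
  const links = [];
  for (const match of html.matchAll(LINK_PATTERN)) {
    // Group 1 is filled for double-quoted URLs, group 2 for single-quoted ones.
    const url = match[1] !== undefined ? match[1] : match[2];
    links.push({ url: url, text: match[3] });
  }
  return links;
}

console.log(extractLinks('<A HREF=\'a.html\'>One</A> and <a href="b.html" class="x">Two</a>'));
```

Inline tags such as <b> or <i> inside the link are returned as part of text and would need to be stripped in a separate step.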
As soon as nested tags come into play, regular expressions are unfit for parsing. However, you can still use them by applying more advanced features of modern interpreters (depending on your regex machine). E.g. .NET regular expressions use a stack; I found this:
(?:<a.*?href=[""'](?<url>.*?)[""'].*?>)(?<name>(?><a[^<]*>(?<DEPTH>)|</a>(?<-DEPTH>)|.)+)(?(DEPTH)(?!))(?:</a>)
Source: http://weblogs.asp.net/scottcate/archive/2004/12/13/281955.aspx
See this example from StackOverflow: Regular expression for parsing links from a webpage?
Using The HTML Agility Pack you can parse the html, and extract details using the semantics of the HTML, instead of a broken regex.
I found this but apparently these guys had some problems with it.
Edit: (It works!)
I have now done my own testing and found that it works. I don't know C#, so I can't give you a C# answer, but I do know PHP, and here's the matches array I got back from running it on this:
Text
array(3) { [0]=> string(52) "Text" [1]=> string(15) "pages/index.php" [2]=> string(4) "Text" }
I have a regex that handles most cases, though I believe it does match HTML within a multiline comment.
It's written using the .NET syntax, but should be easily translatable.
Just going to throw this snippet out there now that I have it working. This is a less greedy version of one suggested earlier. The original wouldn't work if the input had multiple hyperlinks. The code below will allow you to loop through all the hyperlinks:
// The pattern must be a verbatim string (@"..."), where "" stands for one double quote.
static Regex rHref = new Regex(@"<a.*?href=[""'](?<url>[^""^']+[.]*?)[""'].*?>(?<keywords>[^<]+[.]*?)</a>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
public void ParseHyperlinks(string html)
{
    MatchCollection mcHref = rHref.Matches(html);
    foreach (Match m in mcHref)
        AddKeywordLink(m.Groups["keywords"].Value, m.Groups["url"].Value);
}
Here is a regular expression that will match the balanced tags.
(?:<a.*?href=[""'](?<url>.*?)[""'].*?>)(?<name>(?><a[^<]*>(?<DEPTH>)|</a>(?<-DEPTH>)|.)+)(?(DEPTH)(?!))(?:</a>)