unknown chars %3cb added while creating html file - html

I'm trying to update an HTML file which contains the following tags:
The <b>tune2fs</b> allows you to convert an <b>ext2</b> file system to
The content is available in an online HTML editor. When I copy and paste it (Ctrl+C, Ctrl+V) into the file, it becomes:
he+%3Cb%3Etune2fs%3C%2Fb%3E+allows+you+to+convert+an+%3Cb%3Eext2%3C%2Fb%3E+file+system
Does copy/paste add some characters? Why does this happen, and is it possible to prevent it?
This is not a programming question, but does something like the copy/paste buffer create this issue?

That is your HTML after URL encoding: the <b> and </b> tags have been percent-encoded. You can change "%3Cb%3E" back to "<b>" and "%3C%2Fb%3E" back to "</b>". Also note that + is the URL-encoded form of " " (space). Alternatively, you could plug the text into any URL decoder and it will decode it for you.
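As a rough illustration of what such a decoder does, here is a minimal JavaScript sketch. The helper name decodeFormEncoded is made up for this example. decodeURIComponent is a standard built-in, but it does not treat + as a space, so the sketch converts those first.

```javascript
// Hypothetical helper: decodes form-style URL encoding, where "+" means a space.
function decodeFormEncoded(encodedText) {
  // Turn "+" into spaces first, then resolve the %XX escapes.
  return decodeURIComponent(encodedText.replace(/\+/g, ' '));
}

// The encoded text from the question.
const encoded =
  'he+%3Cb%3Etune2fs%3C%2Fb%3E+allows+you+to+convert+an+%3Cb%3Eext2%3C%2Fb%3E+file+system';

console.log(decodeFormEncoded(encoded));
// he <b>tune2fs</b> allows you to convert an <b>ext2</b> file system
```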

Text to HTML conversion in Node Js

I am using nodemailer to send mail from a Node server. The content for this mail comes from an MSSQL server and is formatted as plain text, which means there are newline characters in it. However, when I send it using nodemailer, the newline characters are missing and the whole text looks messed up. The other way is to insert HTML line-break tags into the plain text before sending, and this works fine, but it involves too much manual work. What I am looking for is a library or utility that can convert the plain text into HTML which I can send by mail.
Is there any library for this requirement, or a way to do this automatically?
The following will wrap all parts that are separated by more than one newline in paragraphs (<p>...</p>) and insert breaks (<br>) where there is just one newline. A text block without any newlines will simply be wrapped in a paragraph.
template = '<p>' + template.replace(/\n{2,}/g, '</p><p>').replace(/\n/g, '<br>') + '</p>';
So for example, it will take this (with a blank line after "Title" and another before "Footer"):
Title
First line.
Second line.
Footer
And convert it to this:
<p>Title</p><p>First line.<br>Second line.</p><p>Footer</p>
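The conversion above can be sketched as a small function. Two things here are additions that the answer does not include and should be read as assumptions: the escapeHtml helper (so that a stray < or & in the database text is not treated as markup) and the normalization of Windows-style \r\n line endings. The function names are made up for this example.

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap blank-line-separated blocks in <p>...</p> and turn single newlines into <br>.
function textToHtml(text) {
  const normalized = text.replace(/\r\n?/g, '\n'); // assumption: input may use \r\n
  const escaped = escapeHtml(normalized);
  return '<p>' + escaped.replace(/\n{2,}/g, '</p><p>').replace(/\n/g, '<br>') + '</p>';
}

const sample = 'Title\n\nFirst line.\nSecond line.\n\nFooter';
console.log(textToHtml(sample));
// <p>Title</p><p>First line.<br>Second line.</p><p>Footer</p>
```

The result could then be passed as the html field of the nodemailer message, with the original string kept as the text field for plain-text clients.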
The simplest solution is to replace the newline characters with <br>.
Try
text.split('\n').join('\n<br>\n')
and you are done.
OK, finally this code snippet worked for me:
template = template.replace(/\n/gi, "</p><br/>")
template = template.replace(/<\/p>/gi, "</p><p><br/>")
It took a lot of trial and error, but eventually it worked.

How to add a line feed in html without using <br> or other tags

When processing an HTML form, I need to append some text to a text field before saving it to the database. To make it prettier, I need to add a line feed in between:
userText = userText + "<br>" + appendedText;
# save userText in database
The problem is that when fetching the text to render the web page, I need to escape the text from the database before rendering, for protection against XSS. Thus, the <br> in userText is rendered as the literal text "<br>" instead of a line feed.
So I am wondering if there is any other way to produce a line feed other than <br>?
I have tried "\n", "\r\n", and a line-feed character reference such as "&#10;"; none of them work.
Also, the appended text is in the same element as the original text, so CSS with 'display:block' is out of the question.
Use \n. Then, after fetching the data from the database and escaping it, but prior to outputting it, replace all the \n with <br />.
This way you are still safe from XSS, and you have full control over the output.
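The order of operations matters here: escape first, then insert the <br /> tags, so that only the tags you add yourself survive as markup. A minimal JavaScript sketch, where escapeHtml and renderStoredText are hypothetical names and a real application would likely use its framework's own escaping function:

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Escape the stored text first, then convert newlines into <br /> tags.
function renderStoredText(stored) {
  return escapeHtml(stored).replace(/\r?\n/g, '<br />');
}

// The text is stored with a plain "\n" instead of "<br>".
const stored = 'user <b>text</b>' + '\n' + 'appended text';
console.log(renderStoredText(stored));
// user &lt;b&gt;text&lt;/b&gt;<br />appended text
```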
If you are using JavaScript to get the form values, you can simply use "\n" to add a new line to the value you want to save:
var toSave = FormValue + "\n\n" + "extratext";

Avoid interpreting HTML code in a QTextBrowser

I have a QTextBrowser in my Qt application. I would like to append some text but, I need part of this text not to be interpreted in HTML. How can I achieve this? May I encode the QString?
If you want your browser not to interpret only parts of your text as HTML, you will need to escape the part you want left alone (replace "<" with "&lt;" etc.). You can use the convenient Qt::escape method:
textBrowser->insertHtml(
    QString("<b>this will be bold</b>") +
    Qt::escape(QString("<b>this will not</b>"))
);
If you would like not to interpret the whole thing you can insert it as plain text:
textBrowser->insertPlainText ( "<b>foobar</b>" );
Finally I solved my own question using:
QString codedHtml = Qt::escape(html);

How to stop an html TEXTAREA from decoding html entities

I have a strange problem:
In the database, I have a literal ampersand lt semicolon:
&lt;div
Whenever it's printed into an HTML textarea tag, the source code of the page shows the &lt; as <.
How do I stop this decoding?
You can't stop entities being decoded in a textarea since the content of a textarea is not (unlike a script or style element) intrinsic CDATA, even though error recovery may sometimes give the impression that it is.
The definition of the textarea element is:
<!ELEMENT TEXTAREA - - (#PCDATA) -- multi-line text field -->
i.e. it contains PCDATA which is described as:
Document text (indicated by the SGML construct "#PCDATA"). Text may contain character references. Recall that these begin with & and end with a semicolon (e.g., Herg&eacute;'s adventures of Tintin contains the character entity reference for the e acute character).
This means that when you type (the invalid HTML of) "start of tag" (<) the browser corrects it to "less than sign" (&lt;), but when you type "start of entity" (&), which is allowed, no error correction takes place.
You need to write what you mean. If you want to include some HTML as data then you must convert any character with special meaning to its respective character reference.
If the data is:
&lt;div
Then the HTML must be:
<textarea>&amp;lt;div</textarea>
You can use the standard functions for converting this (e.g. PHP's htmlspecialchars or Perl's HTML::Entities module).
NB 1: If you were using XHTML[2] (and really using it, it doesn't count if you serve it as text/html) then you could use an explicit CDATA block:
<textarea><![CDATA[<div]]></textarea>
NB 2: Or if browsers implemented HTML 4 correctly
OK, but the question is: why does it decode them anyway? Suppose I enter &lt; and save the textarea. It is saved as &lt; in the database, but when the page is loaded again it is displayed as <, and saving once more stores a plain < in the database. Why does the textarea decode it?
The server sends (to the browser) data encoded as HTML.
The browser sends (to the server) data encoded as application/x-www-form-urlencoded (or multipart/form-data).
Since the browser is not sending the data as HTML, the characters are not represented as HTML entities.
If you take the data received from the client and then put it into an HTML document, then you must encode it as HTML first.
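The asymmetry described above can be seen in a small JavaScript sketch. The field name somename and the sample value are assumptions for illustration. URLSearchParams is a standard API in browsers and Node.js, and is used here only to mimic how a form body is encoded and decoded.

```javascript
// What the browser sends when the textarea currently shows the four characters "<div".
const requestBody = new URLSearchParams({ somename: '<div' }).toString();
console.log(requestBody);
// somename=%3Cdiv

// What the server sees after decoding the form body: raw characters, no HTML entities.
const received = new URLSearchParams(requestBody).get('somename');
console.log(received);
// <div
```

Nothing in this round trip produces &lt;, which is why the value has to be HTML-encoded again before it is written back into a page.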
In PHP, this can be done using htmlentities(). Example below.
<?php
$content = "This string contains the TM symbol: &trade;";
print "<textarea>". htmlentities($content) ."</textarea>";
?>
Without htmlentities(), the textarea would interpret and display the TM symbol (™) instead of "&trade;".
http://php.net/manual/en/function.htmlentities.php
You have to be sure that this is rendered to the browser:
<textarea name="somename">&amp;lt;div</textarea>
Essentially, this means that the & in &lt; has to be HTML-encoded to &amp;. How to do it will depend on the technologies you're using.
UPDATE: Think about it like this. If you want to display <div> inside a textarea, you'll have to encode < and >, because otherwise <div> would be a normal HTML element to the browser:
<textarea name="somename">&lt;div&gt;</textarea>
Having said this, if you want to display &lt;div&gt; inside a textarea, you'll have to encode the & again, because the browser decodes HTML entities when rendering HTML. It has nothing to do with your database.
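The same rule can be written as a short JavaScript sketch for a server that builds the page as a string. The helper names escapeHtml and textareaMarkup are hypothetical, and the field name is taken from the example above. In practice the escaping function of your template engine would do this job.

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the textarea markup so that the user sees exactly the stored characters.
function textareaMarkup(name, storedValue) {
  return '<textarea name="' + escapeHtml(name) + '">' + escapeHtml(storedValue) + '</textarea>';
}

// The database holds the literal characters & l t ; d i v.
console.log(textareaMarkup('somename', '&lt;div'));
// <textarea name="somename">&amp;lt;div</textarea>

// The database holds the literal characters < d i v >.
console.log(textareaMarkup('somename', '<div>'));
// <textarea name="somename">&lt;div&gt;</textarea>
```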
You can serve your DB-content from a separate page and then place it in the textarea using a Javascript (jQuery) Ajax-call:
request = $.ajax({
    type: "GET",
    url: "url-with-the-troubled-content.php",
    success: function(data) {
        document.getElementById('id-of-text-area').value = data;
    }
});
Explained at
http://www.endtask.net/how-to-prevent-a-textarea-element-from-decoding-html-entities/
I had the same problem, and I just made two replacements on the text from the database before letting it into the textarea:
myString = Replace(myString, "&", "&amp;")
myString = Replace(myString, "<", "&lt;")
Replacement no. 1 tricks the textarea into showing the codes.
Replacement no. 2: without this replacement you cannot show the text "</textarea>" inside the textarea (it would end the textarea tag).
(ASP / VBScript code above; translate it to the replace method of your language of choice.)
I found an alternative solution for reading and working with the content in the browser: simply read the element's text using jQuery's text(). It returns the characters as displayed and lets me write from a textarea into a div's innerHTML via html().
With only JS and HTML...
...to answer the actual question, with a bare-minimal example:
<textarea id=myta></textarea>
<script id=mytext type=text/plain>
&trade;
</script>
<script> myta.value = mytext.innerText; </script>
Explanation:
Script tags render neither HTML nor entities, so text stored in a script tag remains unaltered. So we use an empty textarea and store the text in a script tag (here, the first one). The problem is that the browser will try to execute it as JavaScript.
To prevent that, we change the MIME type to text/plain instead of its default, which is text/javascript. This prevents it from running.
Then, to populate the textarea, we copy the script tag's content to it (here done in the second script tag).
The only caveats I have found with this are that you have to use JavaScript and that you cannot include script tags directly in the stored text.

When should I HTML-escape data and when should I URL-escape data?

When should I HTML-escape data in my code and when should I URL-escape it? I am confused about which one to use when.
For example, given an <input> element which asks for a URL:
<input type="text" value="DATA" name="URL">
Should I HTML-escape DATA here or URL-escape it?
And what about an <a> element:
<a href="URL">NAME</a>
Should URL be URL-escaped or HTML-escaped? What about NAME?
Thanks, Boda Cydo.
URL encoding ensures that special characters such as ? and & don't cause the URL to be misinterpreted on the receiving end. In practice, this means you'll need to URL encode any dynamic query string values that have a chance of containing such characters.
HTML encoding ensures that special characters such as > and " don't cause the browser to misinterpret the markup. Therefore you need to HTML encode any values output into the markup that might contain such characters.
So in your example:
DATA needs to be HTML encoded.
Any dynamic segments of URL will need to be URL encoded, then the whole string will need to be HTML encoded.
NAME needs to be HTML encoded.
HTML-escape when you're writing anything to an HTML document.
URL-escape when you're constructing a URL to call in code, or for a browser to call (i.e. in the href attribute).
In your examples you'll want to 'attribute'-escape the attributes. (I can't remember the exact function name, but it's in HttpUtility.)
In the examples you show, it should be first URL-escaped, then HTML-escaped:
<a href="http://www.example.com?arg1=this%2C+that&amp;arg2=blah">
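The two layers can be sketched in JavaScript as follows. The helper names escapeHtml and buildLink are hypothetical, and the parameter values are chosen to reproduce the example link. URLSearchParams is a standard API that URL-encodes each value (using + for spaces), and the finished URL is then HTML-escaped as a whole before it goes into the attribute.

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Step 1: URL-encode the dynamic query values.
// Step 2: HTML-escape the whole URL and the link text.
function buildLink(baseUrl, params, label) {
  const query = new URLSearchParams(params).toString();
  const url = baseUrl + '?' + query;
  return '<a href="' + escapeHtml(url) + '">' + escapeHtml(label) + '</a>';
}

console.log(buildLink('http://www.example.com', { arg1: 'this, that', arg2: 'blah' }, 'NAME'));
// <a href="http://www.example.com?arg1=this%2C+that&amp;arg2=blah">NAME</a>
```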