Problems with textarea, within a textarea - html

I'm making a cms system, where you can edit the web pages that is stored in the database(html) on the website. i'm using textarea to display the html code like this.
<textarea><?php echo($pagecontent); ?></textarea>
The problem is that, on one of the pages, there's a textarea to, and for some reason it just cuts of the rest of the code after the starts.
Here's a picture, so you might understand me better: http://i.imgur.com/QueZNEV.png
Been looking around, but i can't seem to find a solution for my problem.

Your textarea contains characters which have special meaning in HTML (<, &, etc) but you want them to be treated as data, not as special characters.
Run them through htmlspecialchars() before echoing them.

Related

Store arbitrary characters in Semantic MediaWiki

I'm trying to store some text containing html tags into properties, which doesn't work. I created a form for a property with the data type 'text' and a template. Saving the form writes the text into the template, but it can't get displayed, as it contains illegal characters, as I guess.
What I'm trying to do:
I need a form to enter data, containing html tags and special
characters
I'd like to be able to use a query to find all those pages
and show that text using a template I provide to the ask query.
I also tried to use the free text option, but then I can't retrieve it using the ask query.
What would be the best, or at least a working solution to this?
Thanks a lot
storing text with html tags is a bit tricky in SemanticMediaWiki
The reason is the invention of the StripMarkers UNIQ/QINU by the MediaWiki developers.
When parsing the content of page with html tags in it the parsing is sort of "postponed". This technical detail unfortunately makes it hard for extension developers like the SMW developers to solve the issue of handling such content. Also it makes it hard for lay people to follow the discussion on how to solve the problem
Here are two examples of SMW Issues that are marked as "closed". This state of affairs means that by following the configuration hints in the issue your problem should be solved. If not please ask a question on the SMW issue list or even initiate the reopening of the issues.
https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/794
https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/3707
On my wiki we ran into this and resolved it by replacing special characters (we had issues with [ ] =, but the same problem happens with to < > tags too) with alternate unicode characters using the regex extension and a template before setting the property with {{#set:}}. If you want to display the formatted text on the wiki directly then call that parameter separately without replacing the unicode characters.
When you want to display the property, you can then run the reverse replacement with regex before displaying your now intact code (using the template result format to allow you to perform the operation on the output of the query).
To switch to special characters you can create this template
{{#regex:{{#regex:{{#regex:{{#regex:{{#regex:{{{1|}}}|/=/|꞊}}|/\[/|[}}|/\]/|]}}|/>/|≽}}|/</|≼}}
And to switch back you can use this as a template
{{#regex:{{#regex:{{#regex:{{#regex:{{#regex:{{{1|}}}|/꞊/|=}}|/[/|[}}|/]/|]}}|/≽/|>}}|/≼/|<}}

How to insert enter within the text of html code?

I am writing up a resume, and the enter, space, and other characters do not show up. It's a poorly coded website. Is there a way to include enter key without accessing the source of the html code, or including html elements? If I insert < br > element, it shows up as & lt; br &gt ;
&#10
Chava G's suggestion might work, but it really shouldn't (it would indicate a serious XSS vulnerability).
You can try something similar though, with an no-break-space unicode character.
This is exactly the same thing that he is suggesting with except I am suggesting you use the actual character itself.
I think you can just copy it from the wiki page, right between the quotes where it says non-breaking space (" ").
https://en.wikipedia.org/wiki/Non-breaking_space
This route is a bit iffy as well though, sometimes unicode explodes.
You may be better of just accepting the limitations.
HTML ignores extra white-space. Use <br /> in place of Enter, and for the space character.
You can also use the <pre> tag. Any text inside this tag will show up as formatted, including spaces and the enter key.
However, since you are using a third-party website here, Wedstrom is right that the above shouldn't work. There is no way for you to change or add HTML code on another party's site, and there shouldn't be. Try to find a way to write up your resume using the functionality that is readily provided for you (or use your own text editor...).

Allow whitespace and line-breaks in HTML input field?

In our internal CRM we have a simple html input textarea where you can leave notes and messages. We later use this information to email this, only since that email is in HTML the formating is all wrong.
So if for example I have the following in my MYSQL table:
This is a test message!
Some line
Some more lines
If we later email this it comes out as:
This is a test message! Some line Some more lines
This is obviously not wanted but I don't want to add some complicated WYSIWYG editor to our CRM. Can I allow line-breaks? If so, how?
I don't want to use <pre></pre> tags because I believe it is not supported in all email clients (I could be wrong).
You could use text/plain header, if you don't intend on using any HTML tags in the message. (That would mean no colors, no links, and no text formatting).
You could also make a quick and dirty solution to replace all \ns in your text to <br>\n.
The problem is that html renders all whitespace as single spaces. If you look at the source of the email once it's received, I'll bet the newlines will be there (if they're not, then the problem is on the email generation side).
<pre></pre> is the simplest thing you can do, I think.
A basic solution would be to replace new lines with <br>s.
A smarter one would give special consideration to multiple line breaks (e.g. treating /\n\s*\n/ as a point to end a paragraph and start a new one (</p><p>)).
The specifics would depend on the language you are using to generate the email from the MySQL data. You might want to consider something like a Markdown parser.
You can send emails in two flavours: html and plain text. In Html, line-breaks are not processed (just like in your browser). Looks like this is what you are doing here.
Two solutions: either you send emails in plain text, or you change line-breaks to <br>.
Assuming PHP is in the mix, there's the nl2br() function. Otherwise, rolling your own won't be hard.
The root of this issue is that browsers (mail clients can either use embedded browsers for rendering - Outlook for example - or behave like browsers) will take any amount of whitespace/new line/carriage returns/etc outside of tags in HTML and render them as a single whitespace. This is so you can do things like indent your markup and still have it look sane in the browser.
You will have to insert markup in order to control the rendering as has been has suggested: convert newlines to <br> or <p> tags and so on, much like cms WYSIWYG editors do. Either that or chose a different format for your emails.

Text style affecting the whole site

I've got an input so the user can type either html or plain text. When the user copy & paste text from MS Word, for example, it generates a weird html. Then, when you view that topic, you can see the whole page's style is affected. I don't really know if the generated html has unclosed tags or something, but it looks like it does and thus, the style of the page is affected.
Does anybody know how to "isolate" the html of that div(or whatever the container be) from the whole page's style?
Short of showing the content in an IFRAME, you can't really do that. What I usually do in this situation is apply tag stripping logic to the content as it comes in. You really don't want to allow arbitrary HTML from a security perspective, but even if you don't care what your users input, you should be stripping out invalid HTML tags (Word has a habit of creating tags with weird namespace-looking things like o:p) and running something like Tidy over the result to ensure every tag is properly closed. There are a number of Tidy libraries for .NET out there; here's one.
Here's a quick cut-and-paste of how I've done this in the past. Note that the class implements an interface from the project I used it in, but you get the general idea.
Copying text from word can include <style> tags. The only sure way to isolate these styles is to put the input control in an <iframe>
You can either sanitize the input or display it in an IFrame.
It it were me I'd strip all but basic formatting (e.g., bold, italics) and use Tidy. That's what I end up doing, I strip and convert all the CSS styles of word into <strong>, <em>, etc.

Shortened HTML text and malformed tags

In my web application I intend to shorten a lengthy string of HTML formatted text if it is more than 300 characters long and then display the 300 characters and a Read More link on the page.
The issue I came across is when the 300 character limit is reached inside an HTML tag, example: (look for HERE)
<a hreHERE="somewhere">link</a>
<a hre="somewhere">liHEREnk</a>
When this happens, the entire page could become ill-formatted because everything after the HERE in the previous example is removed and the HTML tag is kept open.
I thinking of using CSS to hide any overflow beyond a certain limit and create the "Read More" link if the text is beyond a certain number, but this would entail me including all the text on the page.
I've also thought about splitting the text at . to ensure that it's split at the end of a sentence, but that would mean I would include more characters than I needed.
Is there a better way to accomplish this?
Note: I have not specified a server side language because this is more of a general question, but I'm using ASP.NET/C# .
Extract the plaintext from the HTML, and display that. There are libraries (like the HTML Agility Pack for .NET) that make this easy, and it's not too hard to do it yourself with an XML parser. Trying to fix a truncated HTML snippet is a losing cause.
One option I can think of is to cut it off at 300 characters and make sure the last index of '<' is less than the last index of '>'. If it is, truncate the string right before the last instance of '>', then use a library like tidy html to fix tags that are orphaned (like the </a> in the example).
There are problems with this though. One thing being if there are 300 chars worth of nothing but HTML - your summary will be displayed as empty.
If you do not need the html to be displayed it's far easier to simply extract the plain text and use that instead.
EDIT: Added using something like tidy html for orphaned tags. Original answer only solved cutting thing mid-tag, rather than within an opening/closing tag.