html purifier - allow links? - html

An old problem with links and HTML Purifier.
I'm using this code:
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'p,b,a[href],i');
$config->set('HTML.AllowedAttributes', 'a.href');
$purifier = new HTMLPurifier($config);
But it doesn't work.
The input
search
will be turned to
search
Any ideas?

Turn off magic quotes. http://php.net/manual/en/security.magicquotes.disabling.php

Related

QToolButton doesn't support HTML

I want to use HTML formatting in a QToolButton. For example, as in this picture, I need a QToolButton that shows "Sara" and "Online".
Here is my code:
viewControl=new QToolButton(this);
QString labelText = "<P><b><i><FONT COLOR='#fff'>";
labelText .append("Sara");
labelText .append("</i></b></P></br>");
labelText .append("online");
viewControl->setText(labelText);
But it seems QToolButton does not render HTML.
How can I resolve this?
I also tried setting a layout on the QToolButton, but it shows an empty box.
QVBoxLayout *titleLayout = new QVBoxLayout();
QLabel *nameLabel = new QLabel("Name");
QLabel *onlineLabel = new QLabel ("online");
titleLayout->addWidget(nameLabel);
titleLayout->addWidget(onlineLabel);
viewControl->setLayout(titleLayout);
According to the answer mentioned here
I don't think this is possible without subclassing QToolButton and overriding paintEvent, but you can try something like this:
toolButton->setStyleSheet("font-style: italic;");

DOMDocument issues: Escaping attributes and removing tags from javascript

I am not a fan of DOMDocument because I believe it is not very good for real-world use. Yet in my current project I need to replace all text in a page (whose source code I don't have access to) with other strings (a sort of translation), so I need to use it.
I tried doing this with DOMDocument and didn't get the expected result. Here is the code I use:
function Translate_DoHTML($body, $replaceArray){
if ($replaceArray && is_array($replaceArray) && count($replaceArray) > 0){
$body2 = mb_convert_encoding($body, 'HTML-ENTITIES', "UTF-8");
$doc = new DOMDocument();
$doc->resolveExternals = false;
$doc->substituteEntities = false;
$doc->strictErrorChecking = false;
if (@$doc->loadHTML($body2)){
Translate_DoHTML_Process($doc, $replaceArray);
$body = $doc->saveHTML();
}
}
return $body;
}
function Translate_DoHTML_Process($node, $replaceRules){
if($node->hasChildNodes()) {
$nodes = array();
foreach ($node->childNodes as $childNode)
$nodes[] = $childNode;
foreach ($nodes as $childNode)
if ($childNode instanceof DOMText) {
if (trim($childNode->wholeText)){
$text = str_ireplace(array_keys($replaceRules), array_values($replaceRules), $childNode->wholeText);
$node->replaceChild(new DOMText($text),$childNode);
}
}else
Translate_DoHTML_Process($childNode, $replaceRules);
}
}
And here are the problems:
Escaping attributes: There are data-X attributes in the file that become escaped. This is not a major problem, but it would be great if I could disable this behavior.
Before DOM:
data-link-content=" <a class="submenuitem" href=&quot
After DOM:
data-link-content=' <a class="submenuitem" href="
Removal of closing tags in JavaScript:
This is the main problem for me. I don't know why DOMDocument would need to remove these tags, but it does. As the example below shows, it removes closing tags inside the JavaScript string. It also removes the last part of the script. It seems DOMDocument parses the JavaScript inside the script element. Maybe because there is no CDATA section? But this is HTML, and we don't need CDATA in HTML; I thought CDATA was for XHTML. I also have no way to add CDATA here. So can I ask it not to parse script tags?
Before DOM:
<script type="text/javascript"> document.write('<video src="http://x.webm"><p>You will need to Install the latest Flash plugin to view this page properly.</p></video>'); </script>
After DOM:
<script type="text/javascript"> document.write('<video src="http://x.webm"><p>You will need to <a href="http://www.adobe.com/go/getflashplayer" target="_blank">Install the latest Flash plugin to view this page properly.</script>
If there is no way to prevent these things, is there any way I can port this code to SimpleHTMLDOM?
Thank you very much.
Try this. Replace the line:
$body2 = mb_convert_encoding($body, 'HTML-ENTITIES', "UTF-8");
with:
$body2 = convertor($body);
and add this function to your code:
function convertor($ToConvert)
{
$FromConvert = html_entity_decode($ToConvert,ENT_QUOTES,'ISO-8859-1');
$Convert = mb_convert_encoding($FromConvert, "ISO-8859-1", "UTF-8");
return ltrim($Convert);
}
But use the right encoding in the context.
Have a nice day.
Based on my search, the reason for the second problem is what "Alex" described in this question: DOM parser that allows HTML5-style </ in <script> tag
But based on their research, there is no good parser out there capable of understanding today's HTML. Also, html5lib's last update was 2 years ago, and it failed to work in real-world situations in my tests.
So I had only one way to solve the second problem: RegEx. Here is the code I use:
function Translate_DoHTML_GetScripts($body){
$res = array();
if (preg_match_all('/<script\b[^>]*>([\s\S]*?)<\/script>/m', $body, $matches) && is_array($matches) && isset($matches[0])){
foreach ($matches[0] as $key => $match)
$res["<!-- __SCRIPTBUGFIXER_PLACEHOLDER".$key."__ -->"] = $match;
$body = str_ireplace(array_values($res), array_keys($res), $body);
}
return array('Body' => $body, 'Scripts' => $res);
}
function Translate_DoHTML_SetScripts($body, $scripts){
return str_ireplace(array_keys($scripts), array_values($scripts), $body);
}
Using the two functions above, I remove every script from the HTML so I can use DOMDocument to do my work. Then, at the end, I add them back exactly where they were.
I am not sure yet whether regex is fast enough for this.
And don't tell me not to use RegEx for HTML. I know that HTML is not a regular language and so on, but if you read the problem yourself, you will suggest the same approach.
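For illustration only, the same placeholder idea can be sketched in JavaScript rather than the PHP used in this answer. The function names `extractScripts` and `restoreScripts` and the placeholder text are hypothetical, and the regex carries the same limitation as the PHP version: it assumes that `</script>` never appears inside a script body.

```javascript
// Hypothetical helpers: a JavaScript illustration of the placeholder technique.
// Swap every <script>...</script> block for an HTML comment placeholder.
function extractScripts(html) {
  const scripts = [];
  const body = html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, (match) => {
    const placeholder = `<!-- __SCRIPT_PLACEHOLDER_${scripts.length}__ -->`;
    scripts.push({ placeholder, source: match });
    return placeholder;
  });
  return { body, scripts };
}

// Put the original script blocks back where the placeholders are.
function restoreScripts(body, scripts) {
  // split/join avoids the special meaning of "$" in String.replace replacements.
  return scripts.reduce(
    (acc, { placeholder, source }) => acc.split(placeholder).join(source),
    body
  );
}
```

The DOM processing would run on `body` between the two calls, so the parser never sees the script contents.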

AJAX HTMLEditorExtender on postback tables don't display

I am currently using an Ajax tool, HTMLEditorExtender, to turn a textbox into a WYSIWYG editor in a C# ASP.NET project. On the initial page load I place a large amount of formatted text and tables into the editor, and it all appears fine, even the tables.
The data is loaded into an asp:Panel, and the rendered output of the panel is what is actually loaded into the extender and displayed.
However, I want a button that saves all of the data in the editor to the Session and, after the button press, still displays everything in the WYSIWYG editor on postback. Everything that loads into the textbox is fine except for the tables: they come up with the tags showing. Is there any way around this?
The code I am using to initially load the page is this:
ContentPlaceHolder cphMain = (ContentPlaceHolder)this.Master.FindControl("MainContent");
Panel pnlContent = (Panel)cphMain.FindControl("innerFrame");
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
HtmlTextWriter hw = new HtmlTextWriter(sw);
pnlContent.RenderControl(hw);
txtPN.Text = sb.ToString();
pnlContent.Visible = false;
On the button click I am having this saved:
string strHTMLText = txtPN.Text;
Session["ProgressNoteHTML"] = strHTMLText;
And I am loading it on the postback like this:
txtPN.Text = (string)Session["ProgressNoteHTML"];
ContentPlaceHolder cphMain = (ContentPlaceHolder)this.Master.FindControl("MainContent");
Panel pnlContent = (Panel)cphMain.FindControl("innerFrame");
pnlContent.Visible = false;
Any ideas as to why any postbacks would make the tags appear and in the original page load they do not?
The solution offered by Erik won't work for table tags that carry attributes. For instance, <table align="right"> will not be decoded. I have also found that <img> tags are encoded by the HTMLEditorExtender as well.
The easier solution is to use the Server.HtmlDecode() method.
TextBox_Editor.Text = Server.HtmlDecode(TextBox_Editor.Text) 'fixes encoding bug in ajax:HTMLEditor
I have the same problem. It seems to have something to do with the default sanitizing that the extension performs on the HTML content. I haven't found a way to switch it off, but the workaround is pretty simple.
Write an anti-sanitizing function that replaces the cleansed tags with proper tags. Below is mine, written in VB.NET. A C# version would look very similar:
Protected Function FixTableTags(ByVal input As String) As String
'find all the matching cleansed tags and replace them with correct tags.
Dim output As String = input
'replace Cleansed table tags.
output = output.Replace("&lt;table&gt;", "<table>")
output = output.Replace("&lt;/table&gt;", "</table>")
output = output.Replace("&lt;tbody&gt;", "<tbody>")
output = output.Replace("&lt;/tbody&gt;", "</tbody>")
output = output.Replace("&lt;tr&gt;", "<tr>")
output = output.Replace("&lt;td&gt;", "<td>")
output = output.Replace("&lt;/td&gt;", "</td>")
output = output.Replace("&lt;/tr&gt;", "</tr>")
Return output
End Function

Adobe TLF and HTML

What is the best way to convert TLF markup to HTML? I want only standard HTML, without the old font tag. I think I saw a utility someone created for this, but I can't remember where it is. Any ideas?
Thanks.
http://help.adobe.com/en_US/ActionScript/3.0_ProgrammingAS3_Flex/WSc3ff6d0ea7785946579a18b01205e1c5646-7fef.html
var ptext:String = "Hello, World";
var flow:TextFlow = TextConverter.importToFlow(ptext, TextConverter.PLAIN_TEXT_FORMAT);
var out:XML = TextConverter.export(flow, TextConverter.TEXT_LAYOUT_FORMAT, ConversionType.XML_TYPE );
but use TextConverter.TEXT_FIELD_HTML_FORMAT instead of TextConverter.PLAIN_TEXT_FORMAT

Sending values through links

Here is the situation: I have 2 pages.
What I want is to have a number of text links(<a href="">) on page 1 all directing to page 2, but I want each link to send a different value.
On page 2 I want to show that value like this:
Hello you clicked {value}
Another point to take into account is that I can't use any php in this situation, just html.
Can you use any scripting, such as JavaScript? If you can, pass the values along in the query string (just add "?ValueName=Value" to the end of your links). Then, on the target page, retrieve the query string value. The following site shows how to parse it out: Parsing the Query String.
Here's the Javascript code you would need:
var qs = new Querystring();
var v1 = qs.get("ValueName")
From there you should be able to work with the passed value.
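One detail worth sketching: a value that contains spaces or characters such as "&" has to be URL-encoded before it goes into the link. The helper below is hypothetical (the name `buildLink` and the page name in the usage note are assumptions, not part of the original answer) and only shows how such a link could be built.

```javascript
// Hypothetical helper: builds a link to page 2 carrying one value in the query string.
function buildLink(page, name, value) {
  // encodeURIComponent escapes spaces, "&", "=" and other reserved characters.
  return page + "?" + encodeURIComponent(name) + "=" + encodeURIComponent(value);
}
```

For example, `buildLink("page2.html", "ValueName", "Hello World")` gives `page2.html?ValueName=Hello%20World`, which can be pasted into a plain `<a href="">` on page 1.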
Javascript can get it. Say, you're trying to get the querystring value from this url: http://foo.com/default.html?foo=bar
var tabvalue = getQueryVariable("foo");
function getQueryVariable(variable)
{
var query = window.location.search.substring(1);
var vars = query.split("&");
for (var i=0;i<vars.length;i++)
{
var pair = vars[i].split("=");
if (pair[0] == variable)
{
return pair[1];
}
}
}
** Not 100% certain if my JS code here is correct, as I didn't test it.
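As an alternative sketch, current browsers provide the standard `URLSearchParams` class, which handles the splitting and also decodes percent-encoded values. The function name `greetingFromSearch` and the fallback message are assumptions made for this example.

```javascript
// Hypothetical helper: builds the "Hello you clicked {value}" text from a query string.
function greetingFromSearch(search, name) {
  // URLSearchParams accepts the string with or without the leading "?".
  const params = new URLSearchParams(search);
  const value = params.get(name); // null when the parameter is missing
  return value === null
    ? "Hello, no value was passed"
    : "Hello you clicked " + value;
}

// In a browser, page 2 could call it like this:
// document.body.textContent = greetingFromSearch(window.location.search, "foo");
```

Assigning the result to `textContent` rather than `innerHTML` keeps a value taken from the URL from being interpreted as markup.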
You might be able to accomplish this using HTML Anchors.
http://www.w3schools.com/HTML/html_links.asp
Append your data to the href attribute of your links, and use JavaScript on the second page to parse the URL and display whatever you want.
http://java-programming.suite101.com/article.cfm/how_to_get_url_parts_in_javascript
It's not clean, but it should work.
Use document.location.search and split()
http://www.example.com/example.html?argument=value
var queryString = document.location.search.substring(1); // search is a property; drop the leading "?"
var parts = queryString.split('=');
document.write(parts[0]); // The argument name
document.write(parts[1]); // The value
Hope it helps
Well, this is pretty basic with JavaScript, but if you want more of this and more advanced stuff, you should really look into PHP, for instance. Using PHP it's easy to pass variables from one page to another. Here's an example:
the url:
localhost/index.php?myvar=Hello World
You can then access myvar in index.php using this bit of code:
$myvar =$_GET['myvar'];
OK, thanks for all your replies. I'll take a look to see if I can find a way to use the scripts.
It's really annoying, since I have to work around a CMS: all pages in the CMS are created with a WYSIWYG editor, which tends to filter out unrecognized tags and scripts.
Edit: OK, it seems that the damn WYSIWYG editor only recognizes HTML tags... (as expected)
Using php
<?php
$passthis = "See you on the other side";
// Escape the value so quotes or angle brackets cannot break the attribute.
echo '<form action="whereyouwantittogo.php" target="_blank" method="post">'.
'<input type="text" name="passthis1" value="'.
htmlspecialchars($passthis, ENT_QUOTES) .'" /> '.
'<button type="submit" value="Submit">Submit</button>'.
'</form>';
?>
The script for the page you would like to pass the info to:
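<?php
// Read the posted value; fall back to an empty string if the field is missing.
$thispassed = isset($_POST['passthis1']) ? $_POST['passthis1'] : '';
// Escape before output so the submitted text cannot inject markup.
$safe = htmlspecialchars($thispassed, ENT_QUOTES);
echo '<textarea>'. $safe .'</textarea>';
echo $safe;
?>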
Use these two snippets on separate pages, with the latter at whereyouwantittogo.php, and you should be in business.