I use a php image resize script which is invoked using:
<img src="/images/image.php?img=test.png&maxw=100&maxh=100" alt="This is a test image" />
but this does not pass W3C validation. Is there any way to get this to validate?
Since you haven't given an exact error message, I have to assume the validation fails because of the ampersands. Just take a look at the error description (which should also be linked directly from the validation report, so you could have easily found this on your own) to see how to solve this.
To avoid problems with both validators and browsers, always use &amp;
in place of & when writing URLs in HTML.
That said, just change your code to:
... src="/images/image.php?img=test.png&amp;maxw=100&amp;maxh=100" ...
It has nothing to do with PHP. All you need to do is turn those & characters into entities:
<img src="/images/image.php?img=test.png&amp;maxw=100&amp;maxh=100" alt="This is a test image" />
Really though, it's not that big of a deal. No browser (that I'm aware of) will misinterpret this, but if you want perfect validation then that's what you need to do.
If you output such URLs from PHP, you can use htmlentities() to automatically convert e.g. & to &amp;
htmlentities — Convert all applicable characters to HTML entities
Example:
$path = "/images/image.php?img=test.png&maxw=100&maxh=100";
$path = htmlentities($path);
echo $path;
This would output the following in your HTML:
/images/image.php?img=test.png&amp;maxw=100&amp;maxh=100
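For comparison, the same escaping step can be sketched outside PHP. The snippet below is only an illustration in Python, not part of the original answer: it uses the standard library's html.escape, and the variable names are made up for the example.

```python
from html import escape

# URL from the question; the raw "&" separates the query parameters.
path = "/images/image.php?img=test.png&maxw=100&maxh=100"

# escape() replaces &, < and > (and quotes, by default) with character references.
escaped_path = escape(path)

# Build the tag with the escaped URL so the attribute value validates.
img_tag = f'<img src="{escaped_path}" alt="This is a test image" />'
print(img_tag)
```

The browser decodes &amp; back to & before requesting the image, so the script still receives the same query string.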
I'm using Jsoup's parseBodyFragment() and parse() methods to work with blocks of code made up of script, noscript, and style tags. The goal isn't to clean them - just to select(), analyze, and output them. The select() portion works really well.
However, the issue is that it's automatically encoding the url parameters of src attributes. So, when the input is this:
<noscript>
<img height="1" width="1" style="display:none;" alt="" src="https://something.orother.com/i/cnt?txn_id=123&p_id=123"/>
</noscript>
I end up with this, returned from Jsoup, via the outerHTML() method:
<noscript>
<img height="1" width="1" style="display:none;" alt="" src="https://something.orother.com/i/cnt?txn_id=123&amp;p_id=123"/>
</noscript>
The issue is that the standard ampersand (&) in the URL parameters is being encoded and output as &amp;. Is there a way to disable this?
I'm looking for a way to get the html of the selected element without modification. Thanks!
Update (2/23/2016): Clarified problem. Also, found an issue on the Github repo describing the problem: https://github.com/jhy/jsoup/issues/372. Looks like this might not be possible.
The original HTML is invalid. An & which doesn't start a character reference must be expressed as &amp; in an HTML attribute value.
HTML parsers are expected to perform error recovery and generate a valid DOM.
Jsoup works by parsing the HTML into a DOM, letting you run queries on it, then exporting the DOM back to HTML afterwards.
You can't avoid white space normalisation, error recovery, or any of the other things that parsers do. The approach used by Jsoup to extract data is not designed to support the preservation of errors.
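The round trip described above can be sketched in a few lines. This is an illustration in Python rather than Jsoup itself: html.unescape stands in for the parser decoding an attribute value and html.escape for the serializer, which is a simplification of what a real HTML parser does.

```python
from html import escape, unescape

# The same URL written two ways: with a bare "&" (as in the input)
# and with the "&" written as a character reference.
raw_src = "https://something.orother.com/i/cnt?txn_id=123&p_id=123"
encoded_src = "https://something.orother.com/i/cnt?txn_id=123&amp;p_id=123"

# "Parsing": both spellings decode to the same attribute value in the DOM.
dom_value_from_raw = unescape(raw_src)
dom_value_from_encoded = unescape(encoded_src)
print(dom_value_from_raw == dom_value_from_encoded)

# "Serializing": the value is written back out with "&" encoded,
# so the spelling used in the original source is no longer recoverable.
serialized = escape(dom_value_from_raw, quote=True)
print(serialized)
```

Once the value is in the DOM there is nothing left to tell the two inputs apart, which is why the output cannot reproduce the unencoded form.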
I am wondering how to get HTML markup to be rendered in a web page when Html.Encode is being used together with a string replacement, as in the example below.
@(Html.Raw(Html.Encode(Model.Test).Replace("\n", "<br />")))
Of course, just using
@(Html.Raw(Model.Name)) e.g. <b>test</b> renders as bold "test"
will achieve what I am asking for, but then I lose the replace code.
I could do this replacement in the controller, which may be the best method. However, I am curious whether this can be done in the view alone.
Thanks
You can use
@Html.Raw("<b>test</b>")
for this.
Html.Encode(Model.Test)
actually changes the string <b> to
&lt;b&gt;
so in fact I think this should be enough:
@(Html.Raw(Model.Test.Replace("\n", "<br />")))
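If the encoding is still wanted for part of the text (for example when the model value holds user input), the order of operations is what matters: encode first, then insert the <br /> tags. A minimal sketch of that order in Python, with html.escape standing in for Html.Encode and a hypothetical helper name:

```python
from html import escape

def encode_with_line_breaks(text: str) -> str:
    """Escape HTML special characters, then turn newlines into <br /> tags."""
    # Escaping first means the <br /> tags added afterwards are not escaped.
    return escape(text).replace("\n", "<br />")

print(encode_with_line_breaks("<b>test</b>\nsecond line"))
```

With this order the <b> markup is shown as literal text while the line breaks are rendered, which is the behaviour the question's original expression produces.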
The Facebook Like button (XFBML) uses this:
<fb:like send="true" width="450" show_faces="true"></fb:like>
Clearly <fb:like> is a tag. XML will accept it, but it's not HTML. So is it normal that the browser keeps it in the document?
What is this kind of programming technique called? Is it the right way? Or is it just another way to create a hidden element and replace the id="fb"?
What does the :something in <fb:like> stand for? How can it be accessed with JavaScript?
This is XHP!
XHP is a PHP extension created by Facebook.
It makes PHP understand XML nodes, so you can write something like this (from their own example):
<?php
$href = 'http://www.facebook.com';
echo <a href={$href}>Facebook</a>;
?>
XHP also allows you to create PHP classes, which can be used in your markup. So the <fb:like /> node is actually turned into a PHP class at compile time. The definition of the class probably looks like this:
<?php
class :fb:like extends :x:element {
...
}
You can read more about it in the link to GitHub above, and on the creator's blog, which is all about XHP.
So to answer your questions:
<fb:like> will not be processed by the browser, but by XHP. XHP turns it into PHP objects, which are finally turned into valid HTML tag(s). This is true when using XHP, but it is also possible to use the same tag without XHP. I'm guessing this is just a matter of parsing the tag in JavaScript and sending the variables to the API, which probably recreates the structure and sends back the HTML.
Not really a technique, but a unique thing that Facebook has developed to make their lives working with PHP easier.
Again, by the time it is returned to the browser, it has been transformed by XHP (after being sent to Facebook through JavaScript). Try looking at the rendered version: it looks different from the simple <fb:like> tag.
I am creating a block and using the FCKEditor rich text input box. I switch to Source mode and enter in the following HTML:
<img src="http://test.com/image.png" alt="an image" />
I check to confirm that input format is set to "Full HTML" and press Save. Upon loading my site, I discover that the HTML in FCKEditor's Source view is now:
<p><img alt="\"an image" src="\"http://test.com/image.png\"" /></p>
Obviously that prevents the image from rendering properly since the browser sees the path to the image as:
"http://test.com/image.png"
Does someone know how to help?
Quick workaround could be to not use the quotes since it seems to be adding them in anyway.
Example:
<img src=http://site.com/image alt=alt text>
Have you changed or selected a suitable text format?
If you go to admin/config/content/formats, you can update or even create a new text format.
Select the one you're currently using that is resulting in this problem, and check if one of the filters is creating this problem. There are some that can influence or generate the problem you're experiencing.
"Correct faulty and chopped off HTML" filter
"Convert URLs into link" filter
"Limit allowed HTML tags" filter
Also, check in the FCKEditor's config page if any auto-correction filter is activated.
In any case, if the problem is inserting images, I think you would be better off with a dedicated module, like IMCE (http://drupal.org/project/imce).
Hope it helps.
How do I limit the types of HTML that a user can input into a textbox? I'm running a small forum using some custom software that I'm beta testing, but I need to know how to limit the HTML input. Any suggestions?
I'd suggest a slightly different approach:
Don't filter incoming user data (beyond preventing SQL injection). User data should be kept as pure as possible.
Filter all outgoing data from the database; this is where things like tag stripping should happen.
Keeping user data clean allows you more flexibility in how it's displayed. Filtering all outgoing data is a good habit to get into (along the lines of the "never trust data" meme).
You didn't state what the forum was built with, but if it's PHP, check out:
http://htmlpurifier.org/
Library Features: Whitelist, Removal, Well-formed, Nesting, Attributes, XSS safe, Standards safe
Once the text is submitted, you could strip any/all tags that don't match your predefined set using a regex in PHP.
It would look something like the following:
find open tag (<)
if contents != allowed tag, remove tag (from <..>)
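The two steps above can be sketched as follows. This is a minimal illustration in Python rather than PHP, and the allowlist, pattern, and function name are all assumptions made for the example, not an existing library API.

```python
import re

# Hypothetical allowlist; adjust to whatever the forum should permit.
ALLOWED_TAGS = {"p", "b", "i", "a"}

# Matches an opening or closing tag and captures the tag name.
TAG_PATTERN = re.compile(r"</?\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def strip_disallowed_tags(text, allowed=ALLOWED_TAGS):
    """Remove every tag whose name is not in the allowlist, keeping its inner text."""
    def keep_or_drop(match):
        # Keep the whole tag if its name is allowed, otherwise drop it.
        return match.group(0) if match.group(1).lower() in allowed else ""
    return TAG_PATTERN.sub(keep_or_drop, text)

print(strip_disallowed_tags("<p>Hi <script>alert(1)</script><b>there</b></p>"))
```

Note that this keeps the attributes of allowed tags untouched and leaves the text inside removed tags in place, so on its own it is not enough to prevent script injection; a dedicated sanitizer is the safer choice for real input.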
Parse the input provided and strip out all HTML tags that don't exactly match the list you are allowing. This can either be a complex regex, or you can do a stateful iteration through the char[] of the input string, building the allowed output string and stripping unwanted attributes on tags like img.
Use a different code system (BBCode, Markdown)
Find some code online that already does this, to use as a basis for your implementation. For example Slashcode must perform this, so look for its implementation in the Perl and use the regexes (that I assume are there)
Regardless of what you use, be sure you know what kind of HTML content can be dangerous.
e.g. a <script> tag is pretty obvious, but a <style> tag is just as bad in IE, because it can invoke JScript commands.
In fact, any style="..." attribute can invoke script in IE.
<object> would be one more tag to be wary of.
PHP comes with a simple function, strip_tags, to strip HTML tags. It allows certain tags to be exempted from stripping.
Example #1 strip_tags() example
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> Other text
Personally, for a forum I would use BBCode or Markdown because of the amount of support and the features provided, such as live preview.