HTML Purifier stripping display:none CSS from images, even with CSS.AllowTricky set to true?

That title is probably a bit confusing, so let me elaborate.
I'm using HTML Purifier to clean up user input, although in this case the only user will be myself (it's in password-protected folders). Long story short, I would like to be able to paste image tag code into a web form, then use that code on the page the form submits to in order to display the image.
However, I need the image tag to carry CSS properties, one of which is
display:block
By default HTML Purifier removes this unless the CSS.AllowTricky option is enabled. As I understand it, setting CSS.AllowTricky to true should allow
display:block
However, after doing this it is still being removed. Has anybody done this before? I can't find much documentation about it on the web. It's not generating any errors in syslog, so I'm assuming the implementation is correct but isn't working as expected.
My code at the moment.
include('HTMLPurifier.standalone.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('CSS.AllowTricky', true);
UPDATE
The config object (which the code above already builds) has to be passed to the HTML Purifier object. Putting it together, it should look something like this:
include('HTMLPurifier.standalone.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('CSS.AllowTricky', true);
$purifier = new HTMLPurifier($config);

Duplicate of http://htmlpurifier.org/phorum/read.php?3,6724 (solution was passing the config object to the HTML Purifier object so that the config actually got applied.)

Related

Jsoup - hidden div class?

I'm trying to scrape a div class, but everything I have tried so far has failed :(
I'm trying to scrape the element(s):
<a href="http://www.bellator.com/events/d306b5/bellator-newcastle-pitbull-vs-scope"><div class="s_buttons_button s_buttons_buttonAlt s_buttons_buttonSlashBack">More info</div></a>
from the website: http://www.bellator.com/events
I tried accessing the list of elements by doing
Elements elements = document.select("div[class=s_container] > li");
but that didn't return anything.
Then I tried accessing just the parent with
Elements elements = document.select("div[class=s_container]");
and that returned two divs with class name "s_container", neither of which is the one I needed :<
Then I tried accessing that one's parent with
Elements elements = document.select("div[class=ent_m152_bellator module ent_m152_bellator_V1_1_0 ent_m152]");
And that didn't return anything.
I also tried
Elements elements = document.select("div[class=ent_m152_bellator]");
because I wasn't sure about the white space, but it didn't return anything either.
Then I tried accessing its parent with
Elements elements = document.select("div#t3_lc");
and that worked, but it returned an element containing
<div id="t3_lc">
<div class="triforce-module" id="t3_lc_promo1"></div>
</div>
which is kind of weird, because I can't see that child when I inspect the website in Chrome :S
Does anyone know what's going on? I feel kind of lost..
What you see in your web browser is not what Jsoup sees. Disable JavaScript and refresh the page to get what Jsoup gets, or press CTRL+U ("Show source", not "Inspect"!) in your browser to see the original HTML document before JavaScript modifications. Your browser's debugger shows the final document after those modifications, so it's not suitable for your needs.
It seems like the whole "UPCOMING EVENTS" section is dynamically loaded by JavaScript.
Even more, this section is loaded asynchronously with AJAX. You can use your browser's debugger (Network tab) to see every request and response.
I found it, but unfortunately all the data you need is returned as JSON, so you're going to need another library to parse JSON.
That's not the end of the bad news and this case is more complicated. You could make direct request for the data:
http://www.bellator.com/feeds/ent_m152_bellator/V1_1_0/d10a728c-547e-4a6f-b140-7eecb67cff6b
but the URL seems random, and several of these URLs (one per upcoming event?) are included inside JavaScript code in the HTML.
My approach would be to get the URLs of these feeds with something like:
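// needs java.util.regex.Pattern and java.util.regex.Matcher
List<String> feedUrls = new ArrayList<>();
// pattern is a guess based on the example URL above; assumes the URLs appear unescaped in the scripts
Pattern feedPattern = Pattern.compile("http://www\\.bellator\\.com/feeds/[^\"'\\s]+");
// select all the scripts
Elements scripts = document.select("script");
for (Element script : scripts) {
    // script contents are data nodes, so read them with data() rather than text()
    Matcher matcher = feedPattern.matcher(script.data());
    while (matcher.find()) {
        feedUrls.add(matcher.group());
    }
}
for (String feedUrl : feedUrls) {
    // download each feed as a raw string; execute().body() avoids parsing the JSON as HTML
    String json = Jsoup.connect(feedUrl).ignoreContentType(true).execute().body();
    // here use a JSON parsing library to get the data you need
}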
An alternative approach would be to stop using Jsoup because of its limitations and use Selenium WebDriver, as it supports dynamic page modifications by JavaScript, so you'd get the HTML of the final result - exactly what you see in the web browser and Inspector.
If anyone finds this in the future: I managed to solve it with Selenium. I don't know if it's a good/correct solution, but it seems to be working.
System.setProperty("webdriver.chrome.driver", "C:\\Users\\PC\\Desktop\\Chromedriver\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
driver.get("http://www.bellator.com/events");
String html = driver.getPageSource();
Document doc = Jsoup.parse(html);
Elements elements = doc.select("ul.s_layouts_lineListAlt > li > a");
for (Element element : elements) {
    System.out.println(element.attr("href"));
}
Output:
http://www.bellator.com/events/d306b5/bellator-newcastle-pitbull-vs-scope
http://www.bellator.com/events/ylcu8d/bellator-215-mitrione-vs-kharitonov
http://www.bellator.com/events/yk2djw/bellator-216-mvp-vs-daley
http://www.bellator.com/events/e8rdqs/bellator-217-gallagher-vs-graham
http://www.bellator.com/events/281wxq/bellator-218-sanchez-vs-grimshaw
http://www.bellator.com/events/8lcbdi/bellator-219-koreshkov-vs-larkin
http://www.bellator.com/events/9rqguc/bellator-macdonald-vs-fitch

Set Dynamic Page Title from Widget Block in Magento 2

I created a custom widget in Magento 2 and would like to dynamically change the page title in the Block's code. However, I can't seem to overwrite the page title that gets set in Magento\Cms\Block\Page::_prepareLayout().
From inside my block I have the following function. As you can see I've tried a handful of ways to make the changes to see if I can get any result.
public function _prepareLayout()
{
    $this->pageConfig->getTitle()->set('TESTING');
    $this->pageConfig->getTitle()->append(' - TEST');
    $this->pageConfig->setMetadata('title', 'Meta Title Test');
    $this->pageConfig->addBodyClass('mytestclass');
    $pageMainTitle = $this->getLayout()->getBlock('page.main.title');
    if ($pageMainTitle) {
        $pageMainTitle->setPageTitle('Page Heading Title');
    }
    return parent::_prepareLayout();
}
I ran the page load through Xdebug, and according to the debugger my _prepareLayout does get called after the default _prepareLayout that originally sets the title. Looking through the variables in the debugger, the above code all appears to be successful. However, when the page actually renders, none of the above takes effect. Not even the body class is added.
I don't see much out there in regards to trying to do this from a widget block. I'm wondering if there is something I'm missing here, or if I'm approaching this the wrong way. Maybe there's a better way to achieve what I want.
Yes, I cleared, flushed, and disabled all of the cache.
Using Magento version 2.1.2
Any guidance would be greatly appreciated!

How to Include "onclick" Object in WordPress HTML

I'm attempting to add an "onclick" attribute that triggers an event to a page in a single-site (i.e. not multisite) WordPress install. The code is:
Send a voice message
When attempting to save the code, WordPress strips the onclick attribute, leaving:
Send a voice message
A user on another forum suggested that this restriction should only apply to multisite non-superadmin users. Again, this is a single-site install with only one admin user.
It is understood that WordPress removes "onclick" from HTML to prevent malicious code. Still, does anyone know how to resolve this?
Thanks.
It appears that with current WordPress (I'm on 4.9.4), TinyMCE does the filtering directly on the editor screen, not when the form is submitted. The allowedtags and allowedposttags globals don't seem to matter, so the solution above does not solve the problem for me.
The method I have developed uses the tiny_mce_before_init filter to alter the allowed tags within TinyMCE. The trick is to add the extended_valid_elements setting with the updated versions of the elements allowed for a.
First, look in the page http://archive.tinymce.com/wiki.php/Configuration3x:valid_elements to find the current value for a, which right now is
a[rel|rev|charset|hreflang|tabindex|accesskey|type|name|href|target|title|class|onfocus|onblur]
And add to the end of that the onclick attribute:
a[rel|rev|charset|hreflang|tabindex|accesskey|type|name|href|target|title|class|onfocus|onblur|onclick]
Then use that in the filter function like this:
function allow_button_onclick_mce($settings) {
    $settings['extended_valid_elements'] = "a[rel|rev|charset|hreflang|tabindex|accesskey|type|name|href|target|title|class|onfocus|onblur|onclick]";
    return $settings;
}
add_filter('tiny_mce_before_init', 'allow_button_onclick_mce');
which you install in your functions.php file in WordPress. You can see it in action by toggling between the text and visual views on the edit page. Without the extended list, the onclick goes away. With it, it remains.
You can solve this by changing the anchor tag into button and adding a script. For more info please refer to this link: Wordpress TinyMCE Strips OnClick & OnChange (need jQuery).
By resolving, I'm assuming you mean to allow the onclick attribute. You will want to be careful with this, because modifying the allowed tags does this for all your users.
You can modify the list of allowed tags and attributes, by adding this to your functions.php file:
function allow_onclick_content() {
    global $allowedposttags, $allowedtags;
    $newattribute = "onclick";
    $allowedposttags["a"][$newattribute] = true;
    $allowedtags["a"][$newattribute] = true; //unnecessary?
}
add_action( 'init', 'allow_onclick_content' );
I suggest trying it with only $allowedposttags first to see if that works for you. According to this other stackexchange post, you should only need allowedtags if you need it for comments or possibly non-logged-in users, but when I did something similar in the past, I needed both of them to work.
On a side note, if you want a list of all already allowed tags and attributes, look inside your /wp-includes/kses.php file.

Is it possible to add CSS code to URL?

After one hour of browsing I decided to ask this question here.
Is it possible to add CSS code to a URL, for example to change the background color?
Something like this: http://yahoo.com (command)style=background-color:#000000;
or similar. Or is it possible to create a URL where the site loads with modified CSS, without using a Chrome extension or similar?
Thanks for help!
No. You can't (using standard software) modify a document by adding anything to that document's URL (unless the server recognises the addition to the URL (e.g. if it was a query string) and returns a different document based on it).
If it was possible then browsers would be exposing every site to XSS attacks.
A browser extension would be the only way to do this client side (but would render users of that extension vulnerable to XSS attacks).
You could also use a bookmarklet in a two-stage approach (1. Visit the page. 2. Click to activate the bookmarklet.).
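A minimal sketch of the bookmarklet idea, assuming you only want to inject a fixed piece of CSS into whatever page is currently open. The helper name makeBookmarklet is made up for illustration; all it does is build the javascript: URL you would save as a bookmark.

```javascript
// Hypothetical helper (not part of any library): builds a bookmarklet URL
// that appends a <style> element containing `css` to the current page.
function makeBookmarklet(css) {
  const code =
    "(function(){" +
    "var s=document.createElement('style');" +
    // JSON.stringify turns the CSS text into a quoted JavaScript string literal.
    "s.textContent=" + JSON.stringify(css) + ";" +
    "document.head.appendChild(s);" +
    "})();";
  // Percent-encode the script so it survives being stored as a URL.
  return "javascript:" + encodeURIComponent(code);
}

// Example: a bookmarklet that turns the page background black.
const blackBackground = makeBookmarklet("body{background-color:#000000 !important;}");
```

Save the returned string as the URL of a bookmark, visit the page, then click the bookmark. Nothing is added to the page's own URL, so this stays a purely client-side change.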
It's possible in a way, but probably not how you imagined it (see Quentin's answer to understand why).
With JavaScript - note that this is not a 'native' feature, so you will need a small workaround. Look at the following example:
function get_query_param(name) {
    name = name.replace(/[\[]/, "\\\[").replace(/[\]]/, "\\\]");
    var regex = new RegExp("[\\?&]" + name + "=([^&#]*)"),
        results = regex.exec(location.search);
    return results == null ? "" : decodeURIComponent(results[1].replace(/\+/g, " "));
}
window.onload = function() {
    var bgcolor = get_query_param('bgcolor');
    if (bgcolor.length) {
        document.getElementById("xyz").style["padding-top"] = "10px";
        document.body.style.backgroundColor = bgcolor;
    }
}
Now try browsing your page with ?bgcolor=red at the end of the URL.
Of course that's only a demonstration of the main idea; you will have to implement each CSS property you wish to modify using this approach.
Hope that helps.
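For what it's worth, current browsers ship URLSearchParams, so the hand-written regular expression is not strictly needed. A minimal sketch under that assumption follows; getQueryParam and safeColor are names made up here, and the colour whitelist is only a rough guess at what you would want to allow.

```javascript
// Hypothetical helper: reads one value from a query string such as "?bgcolor=red".
// In a browser you would pass window.location.search.
function getQueryParam(search, name) {
  const value = new URLSearchParams(search).get(name);
  return value === null ? "" : value;
}

// Assumption: only plain colour keywords and hex codes are accepted, so that
// arbitrary text from the URL is never written into a style property.
function safeColor(value) {
  return /^(#[0-9a-fA-F]{3,8}|[a-zA-Z]{1,30})$/.test(value) ? value : "";
}

// Browser-only wiring, guarded so the helpers can also be loaded outside a page.
if (typeof window !== "undefined") {
  window.addEventListener("load", function () {
    const bgcolor = safeColor(getQueryParam(window.location.search, "bgcolor"));
    if (bgcolor) {
      document.body.style.backgroundColor = bgcolor;
    }
  });
}
```

Checking the value before using it matters here because anyone can craft the link, and the page should not apply whatever text happens to arrive in the URL.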
Yes it is possible. Follow this:
Yahoo
is it possible to create an url where the site loads with a modified css
Solution:
Add something like this : ?v=1.1
<link rel="stylesheet" href="style.css?v=1.1">
When you change the CSS, change the version as well, e.g. to ?v=1.2; your browser will then load the newly updated CSS. Note that you can use any new number each time you change the CSS.
This will have no effect on the CSS. It will only serve to make the browser think it’s a completely different file.
If you don’t change the value for a while, the browser will continue to cache (or preserve) the file, and won’t attempt to download it unless other factors force it to, or you end up updating the query string value.

How can I make hyperlinks open in a new tab using CSS or Multimarkdown?

I am using Text::MultiMarkdown to create HTML files from MultiMarkdown documents.
I would like all links to open in a new tab.
Is there a way to configure this behavior using a CSS template, or directly in the MultiMarkdown document (without explicitly writing HTML around each link in the MultiMarkdown document)?
Definitely not in CSS - that is only concerned with the way the elements appear, not how they behave.
It should be possible to add <base target="_blank"> to the head of the HTML document (using XSLT), but that's on par with adding it to each link.
In HTML and/or JavaScript you can only initiate the opening of a new window. In some UAs the user can force a new window to open as a new tab instead, but you cannot control this behaviour.
In theory, you could do this with CSS3: http://www.w3.org/TR/css3-hyperlinks/ - however, no common browser ever implemented this. The reason might be the common belief that the choice of when a new window or tab is opened should be left to the user alone.
You can't do this in CSS but you can use the source.
You could subclass Text::MultiMarkdown and provide your own implementation of _GenerateAnchor, something similar to this might work:
sub _GenerateAnchor {
    my ($self, $whole_match, $link_text, $link_id, $url, $title, $attributes) = @_;
    # Add target="_blank" to every link that is not an in-page "#fragment" link
    if ($url && index($url, '#') != 0) {
        $attributes = $attributes ? $attributes . ' target="_blank"' : 'target="_blank"';
    }
    return $self->SUPER::_GenerateAnchor($whole_match, $link_text, $link_id, $url, $title, $attributes);
}
This is a bit kludgey as _GenerateAnchor isn't part of the public interface. You'd also need to use the OO interface rather than just the markdown function.
You could also contact the Text::MultiMarkdown author and see if he'll add a flag for this sort of thing. Maybe you could provide a patch to get things started.
You can also use HTML::Parser and friends to parse the HTML that comes out of Text::MultiMarkdown and add the target attributes yourself.