TinyMCE scrubbing HTML in Umbraco

I'm trying to use Bootstrap's collapse functionality in Umbraco, but when I edit the HTML of a page in the rich text editor (TinyMCE), the data- attributes are scrubbed when I save the page, so the plugin doesn't work. I've followed the "Allow any markup in the tinymce editor" how-to with no effect. Can I stop TinyMCE scrubbing my HTML?
EDIT: I've reproduced the problem at http://fiddle.tinymce.com/BNcaab
Try pasting the code below into the HTML editor, then saving and clicking the HTML editor again.
<a class="accordion-toggle down" data-toggle="collapse" data-parent="#accordion2" href="#collapseOne">
<h4>Slide 1</h4>
<span class="accordion-arrow"></span>
</a>

Umbraco runs TidyHtml after a save/publish event, and unfortunately it doesn't sync 100% with TinyMCE's valid_elements/invalid_elements settings. A number of HTML5 elements and attributes get discarded, and I believe some other basic elements such as <script> (this is for the better, I say!) and <iframe> do too. I can't remember the exact list of elements that Tidy will squash, but we ran into this problem on our latest Umbraco 4.8.11 implementation and unfortunately had to resort to disabling Tidy.
Disabling Tidy can be done in the [/config/umbracoSettings.config] with the following:
<!-- clean editor content with use of tidy -->
<TidyEditorContent>False</TidyEditorContent> <!-- gross but: http://our.umbraco.org/wiki/how-tos/customizing-the-wysiwyg-rich-text-editor-(tinymce)/allow-any-markup-in-the-tinymce-editor -->

Unfortunately, the <![CDATA[*[*]]]> setting is buggy in the recent version of TinyMCE.
You will have to use the config option valid_elements and set the attributes as valid there.
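A minimal sketch of that approach, assuming the Bootstrap attributes from the question are the ones to preserve. It uses TinyMCE's extended_valid_elements option, which adds rules on top of the defaults, rather than valid_elements, which replaces the default rule set entirely. The buildRule helper and the selector are hypothetical; in Umbraco the same rule string would go in the editor's configuration rather than in script.

```javascript
// Hypothetical helper: builds a TinyMCE element rule such as
// "a[class|href|data-toggle|data-parent]" from a tag name and attribute list.
function buildRule(tag, attributes) {
  return tag + '[' + attributes.join('|') + ']';
}

// Attributes taken from the markup in the question.
const anchorRule = buildRule('a', ['class', 'href', 'data-toggle', 'data-parent']);
const spanRule = buildRule('span', ['class']);

const editorConfig = {
  selector: 'textarea', // assumption: adjust to the real editor element
  // extended_valid_elements adds to the default rules instead of replacing them.
  extended_valid_elements: [anchorRule, spanRule].join(',')
};

// Only initialise when TinyMCE is actually loaded on the page.
if (typeof tinymce !== 'undefined') {
  tinymce.init(editorConfig);
}
```

Listing the attributes explicitly keeps the rest of TinyMCE's filtering in place, which is usually preferable to allowing everything.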

How to generate thumbnail for attachment uploaded via the confluence rest API

I am creating a tool for our own Confluence server using Python 3.7.3 that uploads attachments to the correct pages. What I'd like to do is show a thumbnail of each uploaded document on its page. I am using atlassian-rest-api to upload documents and update the pages.
[Screenshot: how the manually inserted document looks on the page]
[Screenshot: how the document opens when clicked]
I am already able to upload attachments, and I copied the HTML generated by Confluence to reproduce the structure of an inserted attachment. After I upload the document I generate new HTML with the attachment, such as this:
<span class="confluence-embedded-file-wrapper conf-macro output-inline has-comment-overlay" data-hasbody="false" data-macro-name="view-file">
<a class="confluence-embedded-file" href="/download/attachments/{page_id}/{filename}?version={file_version}&modificationDate={file_mod_date}&api=v2" data-nice-type="text.plain" data-file-src="/download/attachments/{page_id}/{filename}?version={file_version}&modificationDate={file_mod_date}&api=v2" data-linked-resource-id="{file_id}" data-linked-resource-type="attachment" data-linked-resource-container-id="{page_id}" data-linked-resource-default-alias="{filename}" data-mime-type="application/msword" data-has-thumbnail="true" data-linked-resource-version="{file_version}">
<img src="/rest/documentConversion/latest/conversion/thumbnail/{file_id}/{file_version}" height="250" width="180">
</a>
<span class="overlay">
<span class="file-type-desc-overlay">
<i class="aui-icon aui-icon-small aui-iconfont-file-doc"></i>
<span class="content">Document</span>
</span>
</span>
</span>
This is copied from the HTML of the manually inserted document, but the difference is that the links seem broken even though manually opening the source URL actually downloads the file. The thumbnail picture also seems to be missing, presumably because of an automatic conversion process that takes place when a document is inserted into the page through the editor.
[Screenshot: how my attempt looks on the page]
[Screenshot: what happens when clicking the document]
Do you have any idea how the thumbnails should be handled on page for Confluence API?
Edit: It also seems that the site drops some of the added attributes during the update. This is the data just before it is sent to the API:
<span class="confluence-embedded-file-wrapper conf-macro output-inline has-comment-overlay" data-hasbody="false" data-macro-name="view-file">
<a class="confluence-embedded-file" data-file-src="/download/attachments/66527941/API test page document.txt?version=50&modificationDate=2020-07-07T15:03:36.916+03:00&api=v2" data-has-thumbnail="true" data-linked-resource-container-id="66527941" data-linked-resource-default-alias="API test page document.txt" data-linked-resource-id="66528088" data-linked-resource-type="attachment" data-linked-resource-version="50" data-mime-type="application/msword" data-nice-type="text.plain" href="/download/attachments/66527941/API test page document.txt?version=50&modificationDate=2020-07-07T15:03:36.916+03:00&api=v2">
<img height="250" src="/rest/documentConversion/latest/conversion/thumbnail/66528088/50" width="180"/>
</a>
<span class="overlay">
<span class="file-type-desc-overlay">
<i class="aui-icon aui-icon-small aui-iconfont-file-doc"></i>
<span class="content">Document</span>
</span>
</span>
</span>
Edit:
According to this article, attributes that are not whitelisted are removed by the API's parser. This means that we cannot manually update the page with elements that carry custom attributes (such as the data-* attributes used by the Confluence elements).
Edit 2:
Upon further investigation, the missing attributes are probably caused by the storage format. Using this information I was able to construct a similar thumbnail with the following template:
<ac:link ac:anchor="anchor">
<ri:attachment ri:filename="API test page document.txt"></ri:attachment>
<ac:link-body>
<ac:image ac:height="250" ac:width="250" ac:border="true" ac:class="confluence-embedded-file">
<ri:url ri:value="/plugins/servlet/view-file-macro/placeholder?type=unknown&name=API test page document.txt&attachmentId={attachment id}&version={attachment version}&mimeType=application/binary&height=250"></ri:url>
</ac:image>
</ac:link-body>
</ac:link>
This produced the following result:
Also when clicked it had similar interaction as the one made with editor:
However, this is still not the same result as you get when inserting with the editor.
Note: If you use this method, the thumbnail link seems to break if you update anything on the page manually using the editor.
Edit 3: The thumbnail broke during editing because of the relative URL. Switch <ri:url ri:value="/plugins/servlet/..."> to <ri:url ri:value="https://your.wiki.address/plugins/servlet/..."> and it will no longer break.
The issue was resolved by replacing the HTML implementation with the storage-format implementation described in Edit 2; the secondary issue of the link breaking is resolved by using an absolute URL, as described in Edit 3.
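The template from Edit 2 combined with the absolute URL from Edit 3 can be assembled along these lines. This is only a sketch: the function names, parameters and base URL are hypothetical, the query parameters are copied from the template above, URL-encoding the file name is an assumption, and escaping & as &amp; assumes the storage format is parsed as XML. The tool in the question is written in Python; the same string building carries over directly.

```javascript
// Escape characters that are special inside an XML attribute value.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: returns storage-format markup for an attachment link
// with a placeholder thumbnail, using an absolute URL so the image survives
// later manual edits of the page.
function buildAttachmentThumbnail(baseUrl, filename, attachmentId, version) {
  const query = [
    'type=unknown',
    'name=' + encodeURIComponent(filename), // assumption: name is URL-encoded
    'attachmentId=' + encodeURIComponent(attachmentId),
    'version=' + encodeURIComponent(version),
    'mimeType=application/binary',
    'height=250'
  ].join('&');
  const url = baseUrl + '/plugins/servlet/view-file-macro/placeholder?' + query;

  return [
    '<ac:link ac:anchor="anchor">',
    '<ri:attachment ri:filename="' + escapeXmlAttr(filename) + '"></ri:attachment>',
    '<ac:link-body>',
    '<ac:image ac:height="250" ac:width="250" ac:border="true" ac:class="confluence-embedded-file">',
    '<ri:url ri:value="' + escapeXmlAttr(url) + '"></ri:url>',
    '</ac:image>',
    '</ac:link-body>',
    '</ac:link>'
  ].join('\n');
}

const markup = buildAttachmentThumbnail(
  'https://your.wiki.address', 'API test page document.txt', '66528088', '50');
```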

<br> is appearing after <span> element on the front end, but it is not in the back end. How do I get rid of it?

I am using the Enfold theme from Kriesi, the Avia layout builder, the 'tab' setup in the Avia layout builder, the plugin 'Ultimate FAQ', and shortcodes from that plugin to pull through different categories of FAQs.
In my first tab, I have this code:
<span class="ewd-ufaq-expand-all"><span class="ewd-ufaq-toggle-all-symbol">c</span> Expand All</span>
<span class="ewd-ufaq-collapse-all ewd-ufaq-hidden"><span class="ewd-ufaq-toggle-all-symbol">C</span> Collapse All</span>
[select-faq faq_id='6486']
[ultimate-faqs include_category='a-to-c']
[ultimate-faqs include_category='d-to-h']
[ultimate-faqs include_category='i-to-n']
[ultimate-faqs include_category='o-to-v']
[ultimate-faqs include_category='w-to-z']
The <span> stuff is code from the FAQ creator to expand/collapse all FAQ answers. Here is my page; the page inspector shows two <br>s between the expand/collapse control and the FAQ.
But I didn't put any <br> there. I don't know where they're coming from.
I'm going to guess that this is a WordPress question.
By default WordPress injects <br> tags into the content wherever a new line occurs. It's designed to help content writers. You won't see it in the text editor because it's added behind the scenes via a 'filter'.
Unfortunately the only way you're going to overcome your problem is to write PHP or JS code in the theme file itself. HTML and CSS will not work here. There's a question that deals with this problem using filters here.
EDIT: Something you could try is removing any whitespace between elements to stop the tags being injected, i.e.:
<span class="ewd-ufaq-expand-all"><span class="ewd-ufaq-toggle-all-symbol">c</span> Expand All</span><span class="ewd-ufaq-collapse-all ewd-ufaq-hidden"><span class="ewd-ufaq-toggle-all-symbol">C</span> Collapse All</span>[select-faq faq_id='6486']
[ultimate-faqs include_category='a-to-c']
[ultimate-faqs include_category='d-to-h']
[ultimate-faqs include_category='i-to-n']
[ultimate-faqs include_category='o-to-v']
[ultimate-faqs include_category='w-to-z']
This will largely depend on how the editor works though so no promises.
This works via console. Try placing it in a script tag prior to the end of the body element.
jQuery(function() {
    jQuery('span.ewd-ufaq-collapse-all').next('br').remove();
});

H2 ADA Violation

I am getting an H2 violation for the anchor tag below.
It says 'H2: Combining adjacent image and text links for the same resource'
<div class="selected-label ccyImage">
</div>
<a href="javascript:void(0);" class="btn dropdown-html-toggle" tabindex="-1">
<span class="caret"></span>
</a>
But there is no image used at all. I don't understand how to resolve it.
So you have some unspecified tool which is detecting an accessibility problem which is different to the accessibility problem you actually have (or it is being really smart and noticing that you are expressing content using background images … don't do that).
There's not much you can do about the misidentification of the problem other than to report a bug to whomever makes the tool.
You can make your HTML more accessible by:
- Not using links when you aren't linking somewhere. If you're using href="javascript:void(0);" then you're doing something wrong.
  - Link to somewhere useful and progressively enhance, or
  - Use a button (not a link) if you can't make it work without JS.
- Putting content in your links (or buttons). There is no text at all there to give any clue to the user what the interactive element is going to do.

Angular. html injecting from tinymce

In my admin panel I have a TinyMCE textarea. I want to preview all my changes.
I get the HTML from TinyMCE:
$scope.text_in = tinymce.get('myTextAreaName').getContent();
and it looks like:
<h1 style="text-align: center;">AAA</h1>
<h3 style="text-align: center;">BBB</h3>
I'm using ngSanitize, and this is the element where I want to inject the HTML:
<div class="new" ng-bind-html="text_in"></div>
That looks OK, but the HTML markup arrives without its inline styles. All other styles from CSS work fine.
When I inspect the element, I see:
<h1>AAA</h1>
<h3>BBB</h3>
What am I doing wrong?
According to the Angular documentation, you can bypass the sanitizer.
http://docs.angularjs.org/api/ngSanitize.$sanitize
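One way to do that in AngularJS is $sce.trustAsHtml, which marks a string as trusted so that ng-bind-html renders it without passing it through $sanitize, and so keeps the inline style attributes. A sketch, assuming the module and controller names below, which are hypothetical. Because this skips sanitization entirely, it suits content written by trusted editors only.

```javascript
// Controller body kept as a plain function so it can be registered or tested.
function PreviewController($scope, $sce) {
  // Call this whenever the preview should refresh.
  $scope.updatePreview = function (html) {
    // trustAsHtml wraps the string so ng-bind-html skips $sanitize.
    $scope.text_in = $sce.trustAsHtml(html);
  };

  // Pulls the current editor content, using the textarea name from the question.
  $scope.previewFromEditor = function () {
    $scope.updatePreview(tinymce.get('myTextAreaName').getContent());
  };
}

// Register only when AngularJS is present; 'adminApp' is a hypothetical name.
if (typeof angular !== 'undefined') {
  angular.module('adminApp', ['ngSanitize'])
    .controller('PreviewController', ['$scope', '$sce', PreviewController]);
}
```

The template from the question, <div class="new" ng-bind-html="text_in"></div>, can stay as it is.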

Can I have attributes on closing tags?

There are many people that mark closing tags like this to help identify the closing tag that goes with an HTML tag:
<div id="header">
<div id="logo">
<a href="index.php">
<img id="logoimg" src="images/as_logo.png" alt="Logo" border="0" />
</a>
</div> <!-- logo -->
</div> <!-- header -->
I was wondering if it is syntactically ok to do this:
<div id="header">
<div id="logo">
<a href="index.php">
<img id="logoimg" src="images/as_logo.png" alt="Logo" border="0" />
</a>
</div id="logo">
</div id="header">
UPDATE: Here is the text from the spec on HTML5.3:
8.1.2.2. End tags
End tags must have the following format:
The first character of an end tag must be a U+003C LESS-THAN SIGN character (<).
The second character of an end tag must be a U+002F SOLIDUS character (/).
The next few characters of an end tag must be the element’s tag name.
After the tag name, there may be one or more space characters.
Finally, end tags must be closed by a U+003E GREATER-THAN SIGN character (>).
8.1.2.3. Attributes
Attributes for an element are expressed inside the element’s start tag.
Note that attributes are only allowed on START TAGS.
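The end-tag format quoted above is simple enough to check mechanically. The following sketch flags end tags that carry anything other than whitespace after the tag name. findEndTagsWithAttributes is a hypothetical helper, and the regular expression is a rough lint over the source text, not an HTML parser.

```javascript
// Matches "</name", then whitespace, then at least one non-space character
// before the closing ">", i.e. an end tag with extra content after the name.
const END_TAG_WITH_EXTRAS = /<\/([A-Za-z][A-Za-z0-9-]*)\s+[^\s>][^>]*>/g;

function findEndTagsWithAttributes(html) {
  return html.match(END_TAG_WITH_EXTRAS) || [];
}

// Sample input: two invalid end tags plus one valid end tag with a
// trailing space, which the spec allows.
const sample = [
  '<div id="header">',
  '<div id="logo">',
  '</div id="logo">',
  '</div id="header">',
  '</div >'
].join('\n');

const offenders = findEndTagsWithAttributes(sample);
console.log(offenders);
```

The two end tags with id attributes are reported, while </div > passes because trailing space characters are permitted.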
Using @jbyrd's idea, an HR tag lets you see whether you forgot the z attribute:
<div id="header">
<div id="logo">
<a href="index.php" id="link">
<img id="logoimg" src="images/as_logo.png" alt="Logo" border="0" />
</a><hr z="link">
</div><hr z="logo">
</div><hr z="header">
Although this adds more text, 32 extra characters vs. the original or the tags having a hidden class, you can use CSS to hide them.
[z] {
display: none;
}
Short answer, No.
Use the comments instead.
The answer is no for most tags. However, you could argue that tags like "img" that can be self-closing, are able to have attributes in them. But these self-closing tags are taking the place of an opening tag and a closing tag, so it's not the same as having an attribute in a closing tag. To be honest, there is really no need for this, it would just create more for the browser to have to read and make the page size bigger.
Sorry, but it doesn't work and doesn't validate.
If you try other attributes in closing tags, the browser skips them. I tried it in several ways, testing with ids and classes, and neither the CSS nor the JavaScript recognized them in the end tag.
Your best bet is the commenting.
EDITED
Or you could make your own html tags.
You must use hyphenation, and you should avoid
document.createElement('foo-bar');
No, not possible. Some browsers will ignore it, but others may complain and fail to display the HTML correctly.
The original question describes a specific scenario of four parts:
improving HTML code readability, and specifically: matching opening and closing <div> … </div> tags;
while reading (debugging) the rendered html page source grabbed from the browser;
when the rendered source has been dynamically generated (server-side generated/processed) and also stripped of all comments before sending the webpage to the requesting client;
and in this case the question is specific to WordPress (the well known php CMS platform for creating websites, blogs, etc.).
The specific complexity here is that there is no one source file to look at on the server as the webpage was dynamically generated by code with input from many files, databases, APIs, etc.
AND
As previously noted, the common technique of placing a comment after each closing </div> is not helpful here, because WordPress has stripped all comments prior to serving the page, presumably to make the page size smaller.
A Javascript Solution:
Forget about trying to hack the html in an effort to circumvent WordPress and the browser and the standards. Instead simply re-insert the comments back into the rendered source like this.
When matching opening <div id="myDivID"> and
closing </div> tags this javascript may help.
This function will comment every closing div tag
and label it with the div’s ID attribute, producing
a result like this:
<div id="myDivID">
<p>
Lorem ipsum dolor sit amet, consectetur
...
anim id est laborum.
</p>
</div><!-- end #myDivID -->
This will work even when the rendered page is
stripped of comments (by WordPress as in the
original question). Just trigger or inject the
function at any point you like, then view or save
the source. As others noted previously, using
comments doesn't violate the spec as some other
suggestions may.
This short function should be easy to understand and
to modify for similar purposes. (Note the insertBefore
workaround, as there is no JS insertAfter method.)
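var d = window.document;
insertCommentAtDivCloseTag(d);
// Appends an "end #id" comment after the closing tag of every div.
function insertCommentAtDivCloseTag(document) {
    var divList = document.getElementsByTagName('div');
    for (var div of divList) {
        var parent = div.parentNode;
        var newNode = new Comment(' end #' + div.id + ' ');
        // insertBefore the next sibling acts as "insert after" the div;
        // a null nextSibling appends at the end of the parent.
        parent.insertBefore(newNode, div.nextSibling);
    }
}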
This is the quick and easy one-off solution. If that’s all you need skip the rest...
If WordPress/web development is something you do everyday you may wish to consider exploring some of the following:
Hack WordPress
Again, forget about hacking the HTML standard and hack WordPress instead. In fact WordPress is designed to be hacked. Virtually every function WordPress uses in creating a webpage has a hook that you can use to override or alter what it does.
Codex, The Rosetta Stone of WordPress
Find the one stripping out your comments and add a function to turn it off and on.
If it’s been thought of before, there’s already a plugin for it.
WordPress Plugins Home Page
WordPress plugins come and go: some are maintained and others not, some are very good, some are poorly designed, and some are just bad. So caveat emptor. With that proviso, I was able to find such a plugin in ten seconds, with one search, on the first try.
Beyond WordPress
For so many reasons, it is beneficial to serve the smallest possible version of your webpage and WordPress may not be the only actor dynamically altering your code or caching older versions.
Your WordPress Installation and Blog Post (database)
↓
WordPress Theme
↓
WordPress Plugins
↓
The HTTP Server, such as Apache and its Modules
↓
A Proxy Server such as Nginx
↓
A Hosting Provider
↓
A CDN, Content Delivery Network
↓
(More Network)
↓
Finally My Browser the Client
↑
Also any Caches Maintained by Any of the Above
Finally, if this sort of thing is or is becoming your job, you’ll eventually want to explore specialized IDEs and separate production and development servers.
Accepting that the simple answer is No, my entire HTML life, I've identified closing tags by following with a comment. But long ago, it became impossible to nest comments. So when debugging, it is a royal pain to comment out a <DIV>...</DIV>, because of that identifying comment. Closing the DIV comment this way makes the comment closure hard to spot.
<!--
</DIV>--><!-- END DIV NAMEOFDIV -->
It is better placed on its own line, but this is both hard to read and involves too much temporary manipulation...
<!--
</DIV>
-->
<!-- END DIV NAMEOFDIV -->
I'm no expert in such issues, but from this HTML end-user's view it seems absolutely absurd that a closing DIV can't be easily identified.
I will experiment with other kludges, such as adding a useless <i CLOSES NAMEOFDIV></i> tag. Or maybe a fake, meaningless tag? (e.g., <ENDDIV NAMEOFDIV>) (The nondisplayed HR z= trick is neat, but yet another visual confusion.)
We really shouldn't have to. What were the powers that be thinking?