For our HTML templates, we use a content aggregation engine based on a namespaced XML-based markup syntax that has some custom tags defining various fragments of the template. For example, for js code we have:
<foo:js src="http://..." />
<p>
This is regular HTML; this document with the namespaced tags will be
aggregated into HTML by our metalanguage aggregator.
</p>
<foo:js>
console.log('bar');
</foo:js>
I'd be very happy if these could be parsed and edited as JS code or JS reference tags (the self-closing one) in PhpStorm. I figured out that the "Language Injections" setting is designed for that, but I can't find any documentation on how to achieve this. The "Places Patterns" field seems to be the key to this enigma. Can anyone provide some hints on how to do this using Language Injections?
Also, for the rest of the tags, as a simple form of completion/validation, is it possible to specify a default XML schema (i.e. XSD) for the namespace? I mean, apart from explicit xmlns:foo tags, because our markup files represent fragments so we'd have to add quite a number of these on quite a lot of elements.
The following injection will highlight javascript in all <namespace:js> tags:
Assuming I am in control of the parsing environment and I'm certain it is only to be converted to HTML (and not any of the many other formats possible), is it OK to embed some HTML within one's Markdown in order to side-step a bug?
Could there be any basic side effects that I (as a newbie) couldn't predict but should be aware of?
Non-conventional Markdown example:
_"<strong>This</strong> is an example sentence."_ -**OP**
Which outputs valid HTML:
<em>"<strong>This</strong> is an example sentence."</em> -<strong>OP</strong>
Resulting in successful content:
"This is an example sentence." -OP
Background (don't have to read):
I noticed that if I include HTML in my Markdown, it appears to get skipped during the conversion, resulting in it being seamlessly incorporated in the output HTML.
This appears to be a good thing, at least in my case (Using Hugo to build a website with a template theme) where the Markdown wasn't producing the correct result (leaving a pair of unwanted *s in the HTML: should have been *italic* but asterisks showing).
For those wondering - yes, I confirmed my Markdown was correct using other parsers that handled it fine.
Note: the examples here are simplifications of my specific case
Not only is it okay to do, but it is encouraged. As the rules state:
For any markup that is not covered by Markdown’s syntax, you simply use HTML itself. There’s no need to preface it or delimit it to indicate that you’re switching from Markdown to HTML; you just use the tags.
And later:
If you want, you can even use HTML tags instead of Markdown formatting; e.g. if you’d prefer to use HTML <a> or <img> tags instead of Markdown’s link or image syntax, go right ahead.
Of course, there are a few things to take into consideration. For example block level tags must be at the document root level (cannot be nested inside blockquotes, lists, etc) and content inside them does not get parsed as Markdown. However, inline tags can be placed anywhere and do not restrict Markdown parsing.
For people using Markdown in highly modular or user-flexible environments (probably slightly more advanced readers):
One should note that although Markdown is most commonly converted to HTML, it can also be used with other formats[1].
For this reason, I think it's important to note that if you (as a publisher of content) are not the one who determines what the Markdown will be parsed with, or how it is converted, it may be 'safer' not to embed HTML in it.
[1] as stated in the Markdown Wikipedia page.
I'm generating XML documentation using Doxygen. I'm formatting my descriptions with Markdown, and I'm perplexed that the tags inside the "detaileddescription" tag are not HTML tags. For example:
<heading level="1">Example</heading>
I checked the configuration file of Doxygen to see if it is possible to force it to generate HTML tags instead of the custom ones. I also Googled to see if someone had the same problem, but found nothing.
Do you see a solution to this problem, or should I just write a converter myself?
While (in theory) doxygen could have produced XHTML for the documentation blocks inside the XML output, it does not do so.
So yes it is a different format and if you need HTML you will need to translate it.
Reasons behind this:
Flexibility: If someone wants to produce another output format, starting from XHTML would be more difficult than using the same internal representation that doxygen itself uses to produce all output formats.
Legacy: Doxygen already produced XML before there was an XML-compliant standard for HTML (i.e. XHTML).
Is there a tool available which would convert the sources given into HTML with links?
By links I mean that every type, class, and method used would point via href to its definition.
I haven't managed to make highlight, syntax-highlight, or pygments work this way. Even when a tool supports input from ctags, it only adds the title attribute, not links.
Highlight can easily be modified to support things such as adding links to function / class definitions, as well as manual entries.
I was able to hook on to the class and function detection, and have each instance linked to the PHP Manual in my testing. I don't know what you'd want yours to link to, so it's your choice (per language, of course.)
Depending upon the language of your source code you might want to use doxygen. It supports a variety of source languages and can export the comments to HTML and LaTeX.
Many modern languages, like Java or C# support XML-comments to document the source code. You can then extract these comments into a single XML file by compiling them with special options. From this XML you can then easily produce HTML by adding an appropriate CSS sheet. MSDN documentation, for example, is largely based upon these HTML files generated in automated mode.
I have some HTML (in this case created via TinyMCE) that I would like to add to a page. However, for security reasons, I don't want to just print everything the user has entered.
Does anyone know of a templatetag (a filter, preferably) that will allow only a safe subset of html to be rendered?
I realize that markdown and others do this. However, they also add additional markup syntax which could be confusing for my users, since they are using a rich text editor that doesn't know about markdown.
There's removetags, but it's a blacklisting approach which fails to remove tags when they don't look exactly like the well-formed tags Django expects, and of course since it doesn't attempt to remove attributes it is totally vulnerable to the 1,000 other ways of script-injection that don't involve the <script> tag. It's a trap, offering the illusion of safety whilst actually providing no real security at all.
HTML-sanitisation approaches based on regex hacking are almost inevitably a total fail. Using a real HTML parser to get an object model for the submitted content, then filtering and re-serialising in a known-good format, is generally the most reliable approach.
If your rich text editor outputs XHTML it's easy, just use minidom or etree to parse the document then walk over it removing all but known-good elements and attributes and finally convert back to safe XML. If, on the other hand, it spits out HTML, or allows the user to input raw HTML, you may need to use something like BeautifulSoup on it. See this question for some discussion.
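As a rough illustration of the parse, filter, and re-serialise approach for the XHTML case, here is a minimal sketch using only the standard library's etree. The whitelist contents, the allowed URL schemes, and the function name `sanitize_xhtml` are my own assumptions for the example, not something Django or TinyMCE provides, and this is not a vetted sanitizer.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical whitelist: adjust to what your editor actually emits.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "a", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")


def _clean(parent):
    """Filter the children of `parent` in place, recursively."""
    prev = None
    for child in list(parent):
        if child.tag not in ALLOWED_TAGS:
            # Drop the element and its content, but keep the text after it.
            tail = child.tail or ""
            if prev is None:
                parent.text = (parent.text or "") + tail
            else:
                prev.tail = (prev.tail or "") + tail
            parent.remove(child)
            continue
        allowed = ALLOWED_ATTRS.get(child.tag, set())
        for name in list(child.attrib):
            value = child.attrib[name]
            if name not in allowed:
                del child.attrib[name]
            elif name == "href" and not value.lower().startswith(SAFE_SCHEMES):
                # Rejects javascript: URLs (and, strictly, relative URLs too).
                del child.attrib[name]
        _clean(child)
        prev = child


def sanitize_xhtml(fragment):
    """Return `fragment` with only whitelisted elements and attributes."""
    # Wrap in a root element so multi-element fragments still parse.
    root = ET.fromstring("<div>" + fragment + "</div>")
    _clean(root)
    # Re-serialise the wrapper's content without the wrapper itself.
    return escape(root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )


print(sanitize_xhtml(
    '<p onclick="x()">Hi <script>alert(1)</script>there '
    '<a href="javascript:alert(1)">l</a></p>'
))
# prints: <p>Hi there <a>l</a></p>
```

Note that `ET.fromstring` raises `ParseError` on input that is not well-formed XML (including HTML-only entities such as `&nbsp;`), so this only applies when the editor really does produce XHTML. For untrusted input you would also want to consider a hardened XML parser.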
Filtering HTML is a large and complicated topic, which is why many people prefer the text-with-restrictive-markup languages.
Use HTML Purifier, html5lib, or another library that is built to do HTML sanitization.
You can use removetags to specify a list of tags to be removed:
{{ data|removetags:"script" }}
I have an FAQ in HTML (example) in which the questions refer to each other a lot. That means whenever we insert/delete/rearrange the questions, the numbering changes. LaTeX solves this very elegantly with \label and \ref -- you give items simple tags and LaTeX worries about converting to numbers in the final document.
How do people deal with that in HTML?
ADDED: Note that this is no problem if you don't have to actually refer to items by number, in which case you can set a tag with
<a name="foo">
and then link to it with
<a href="#foo">some non-numerical way to refer to foo</a>.
But I'm assuming "foo" has some auto-generated number, say from an <ol> list, and I want to use that number to refer to and link to it.
There is nothing like this in HTML.
The way you would normally solve this, is by having the HTML for the links generated, by either parsing the HTML itself and inserting the TOC (you can do that on the server, before you send the HTML out to the browser, or on the client, by traversing the DOM with a little piece of ECMAScript and simply collecting and inspecting all <a> elements) or generating the entire HTML document from a higher level source like a database, an XML document, markdown or – why not? – even LaΤΕΧ.
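To make the "generate it from a higher-level source" option concrete, here is a minimal sketch in Python. The `{ref:label}` placeholder syntax, the `render_faq` function, and the tuple layout are all invented for this example; the idea is only that labels are mapped to list positions first, and references are then rewritten into numbered links.

```python
import re


def render_faq(entries):
    """Render (label, question, answer) tuples as a numbered HTML list.

    Question and answer text may contain {ref:label} placeholders
    (a made-up syntax), which become links showing the target's number.
    The text is assumed to be trusted HTML written by the FAQ authors.
    """
    # First pass: assign each label its position in the list.
    numbers = {label: n for n, (label, _, _) in enumerate(entries, start=1)}

    def resolve(text):
        def repl(match):
            label = match.group(1)
            # An unknown label raises KeyError, which flags a broken reference.
            return '<a href="#%s">%d</a>' % (label, numbers[label])
        return re.sub(r"\{ref:([\w-]+)\}", repl, text)

    # Second pass: emit the list, with ids as link targets.
    items = [
        '<li id="%s"><p>%s</p><p>%s</p></li>' % (label, resolve(q), resolve(a))
        for label, q, a in entries
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


faq = [
    ("install", "How do I install it?", "See also question {ref:update}."),
    ("update", "How do I update?", "Reinstall as in question {ref:install}."),
]
print(render_faq(faq))
```

Reordering the entries in `faq` changes the numbers in every reference automatically, which is the behaviour \label and \ref give you in LaTeX.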
I know it's not widely supported by browsers, but you can do this using CSS counters.
Also, consider using ids instead of names for your anchors.
Instead of \label{key} use <a name="key" />. Then link using <a href="#key">Link</a>.
PrinceXML can do that, but that's about it. I suppose it'd be best to use server-side scripting.
Here's how I ended up solving this with a PHP script:
http://yootles.com/genfaq
It's roughly as convenient as \label and \ref in LaTeX and even auto-generates the index of questions.
And I put it on an etherpad instance which is handy when multiple people are contributing questions to the FAQ.