Javadocs without HTML

Robert C. Martin's book Clean Code contains the following:
HTML in source code comments is an abomination [...] If comments are going to be extracted by some tool (like Javadoc) to appear in a Web page, then it should be the responsibility of that tool, and not the programmer, to adorn the comments with appropriate HTML.
I kind of agree - source code surely would look cleaner without the HTML tags - but how do you make decent-looking Javadoc pages then? There's no way to even separate paragraphs without using an HTML tag. The Javadoc manual says it clearly:
A doc comment is written in HTML.
Are there some preprocessor tools that could help here? Markdown syntax might be appropriate.

I agree. (This is also the reason why I am -strongly- opposed to C#-style "XML comment blocks"; the Javadoc DSL at least provides some escape for top-level entities!). To this end I simply do not try to make the javadoc look pretty...
...anyway, you may be interested in Doxygen. Here is a very quick post: Doxygen versus Javadoc. It also brings up the issues that you do :-)

HTML is nothing I'd like to see in "normal" comments. But for tools like Javadoc, HTML makes it possible to add formatting information, bullet points, etc.
I would distinguish these two things:
Non-Javadoc code comments are for the programmer who maintains or enhances the code in question. He has to dig through existing sources, and any HTML in comments just doesn't make things easier. So, ban it in normal comments.
Javadoc comments are used to generate documentation. Use HTML where it helps. But a very limited subset of HTML should suffice.
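On the preprocessor idea raised in the question, here is a rough Python sketch of what such a tool could do: take the plain-text body of a comment and add the paragraph and list tags before handing it to Javadoc. The function name and the tiny "blank line = paragraph, leading dash = bullet" convention are assumptions made up for this example; it is not an existing tool and not real Markdown.

```python
import re
from html import escape


def plain_to_javadoc_html(text):
    """Turn blank-line separated blocks into <p> and <ul> markup.

    Hypothetical convention for this sketch only:
    - a block whose lines all start with "- " becomes a bullet list
    - any other block becomes one paragraph
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    html_blocks = []
    for block in blocks:
        lines = [line.strip() for line in block.splitlines()]
        if all(line.startswith("- ") for line in lines):
            items = "".join(
                f"<li>{escape(line[2:], quote=False)}</li>" for line in lines
            )
            html_blocks.append(f"<ul>{items}</ul>")
        else:
            # Join wrapped lines and escape <, > and & so they survive as text.
            html_blocks.append("<p>" + escape(" ".join(lines), quote=False) + "</p>")
    return "\n".join(html_blocks)


if __name__ == "__main__":
    print(plain_to_javadoc_html("First paragraph.\n\n- one\n- two"))
```

The source comment stays free of tags, and the HTML only appears in the generated output, which is roughly the split of responsibilities the quote asks for.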

Related

Which HTML is supported in Jenkins job descriptions?

In the job description you can use HTML tags.
I have something like:
blabla.. on http://vms029/wa_shdw
But the target="_blank" seems to get scrubbed somewhere.
Is there another way?
Any doc on what's supported and what's not?
Jenkins allows you to use various markup languages to write job descriptions; plugins can define how the description should be parsed via the MarkupFormatter interface.
By default, the RawHtmlMarkupFormatter is used, which applies an HTML sanitisation policy (from the OWASP AntiSamy Project) — the Myspace policy.
In the Myspace policy, you'll see that only certain tags and attributes are allowed. target isn't one of them, which is why you see it being stripped from your input.
For your use case, the alternatives are to install and configure another markup formatter plugin, or to write your own. Some examples include:
Escaped Markup Plugin: escapes all HTML tags (probably not so useful for you)
"Anything Goes" Formatter: allows any HTML input at all (with the associated security risks)
PegDown Formatter Plugin: lets you write your descriptions in Markdown (probably the nicest option here, but likely doesn't support things like target="_blank")
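To make the stripping behaviour described above concrete, here is a minimal Python sketch of allowlist-based sanitisation. The ALLOWED table is invented for illustration; it is not the real Myspace policy and this is not how Jenkins implements it. It only shows the principle: anything not on the list is silently dropped.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration only; NOT the real Myspace policy.
# Maps each permitted tag to the attributes permitted on it.
ALLOWED = {
    "a": {"href"},
    "b": set(),
    "i": set(),
    "p": set(),
}


class AllowlistSanitizer(HTMLParser):
    """Rebuilds the input, keeping only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text content is still kept
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag] and v is not None]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(html_text):
    # Sketch only: a real policy would also validate URL schemes and more.
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<a href="http://vms029/wa_shdw" target="_blank">link</a>'))
```

With this table the href survives and target is removed, which matches what the question observed.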
Because I had the fun of trying to figure out what exactly should work, documentation is light on usable details, and I don't want to have to do this again in a year or two, here goes:
Any references to RawHtmlMarkupFormatter are obsolete by now. As a comment said, the "safe html" markup is now provided by OWASP Markup Formatter Plugin (antisamy-markup-formatter). The actual tags it permits are visible indirectly in the BasicPolicy which uses org.owasp.html.Sanitizers. These two references together allow figuring out what's really supposed to be ok.
For example <font color=...> used to work back in the day (see MyspacePolicy in the other answer) but no longer appears to be allowed. However, enough simple <span style="color:..."> styles are permitted to get something equivalent. This matches the observed behavior of OWASP Markup Formatter 2.0 on a Jenkins instance.

Structuring an HTML document

I have taken over some software which produces an HTML document with no structure.
The HTML in itself is good enough: well enclosed, properly nested and so on. But it is almost impossible to read with the human eye, as the line breaks fall wherever the text editor used to view the document pleases.
So, my question is as follows.
Does any of you know an online parser or program that takes a messy, more or less minified HTML document and shows it in a human-readable form? Preferably it would also indent the various tags to show their nesting levels.
Thanks in advance
Try this maybe (just picked the first link for a 'html online tidy' google search). http://infohound.net/tidy/
Try this.
It is online and it is free.
Almost any HTML editor will have this capability. For instance: HTMLKit
The JS Beautifier also works with HTML: http://jsbeautifier.org/
There are other, similar tools available online if you search.
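If you would rather script it than paste into a website, the re-indenting itself is not much code. Below is a rough sketch using only Python's standard library html.parser. The class and function names are made up for this example, and it is deliberately naive: every tag and text run goes on its own line, and whitespace is collapsed, so it would mangle <pre> blocks and inline scripts. A proper tool such as Tidy handles those cases.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not increase the depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class Indenter(HTMLParser):
    """Puts each tag and text run on its own line, indented by nesting depth."""

    def __init__(self, indent="  "):
        # Keep entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.indent = indent
        self.depth = 0
        self.lines = []
        self.text = []  # pending text pieces between two tags

    def _emit(self, line):
        self.lines.append(self.indent * self.depth + line)

    def flush_text(self):
        joined = " ".join("".join(self.text).split())
        self.text = []
        if joined:
            self._emit(joined)

    def handle_starttag(self, tag, attrs):
        self.flush_text()
        self._emit(self.get_starttag_text())
        if tag not in VOID:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        self.flush_text()
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.flush_text()
        if tag in VOID:
            return
        self.depth = max(self.depth - 1, 0)
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        self.text.append(data)

    def handle_entityref(self, name):
        self.text.append(f"&{name};")

    def handle_charref(self, name):
        self.text.append(f"&#{name};")

    def handle_comment(self, data):
        self.flush_text()
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.flush_text()
        self._emit(f"<!{decl}>")


def prettify(html_text, indent="  "):
    parser = Indenter(indent)
    parser.feed(html_text)
    parser.close()
    parser.flush_text()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    print(prettify("<ul><li>One</li><li>Two<br>more</li></ul>"))
```

Since the goal here is only to read the document, not to republish it, the collapsed whitespace is usually an acceptable trade-off.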

Writing a book for both print and HTML which can include code samples

I want to write a book on programming. I need to target both print and HTML.
In order not to get burned with the code examples, I need to be able to include parts of source code which have been marked up with start and end points to ensure the code is up to date and compiles. Extract the code from external files if you will.
I would like some simple format such as Txt2tags rather than LaTeX, since I can then use Word's fine spelling capabilities.
Any experiences you want to share?
It is important to note that by starting with Txt2tags you will be able to export your documents into LaTeX. To my knowledge this is a one-way street, so by starting with Txt2tags you can still have the flexibility of LaTeX, but by going with LaTeX you don't get the benefits of Txt2tags.
Firstly, don't dismiss LaTeX too rapidly. Although it can be a bit of a pain to spellcheck, it's still quite doable with tools like aspell.
That being said, I would highly recommend using Emacs' org-mode. It will provide you with a nice foldable overview of your book's structure, and is much more readable in plain text than LaTeX. Additionally, since it uses Emacs' native syntax highlighting when you export (to HTML, LaTeX, PDF, etc.), you'll be able to write the code inline (between #+begin_src and #+end_src markers) and get a much more precise WYSIWYG view of the code snippets you include.
Since Emacs will work with aspell out-of-the-box, you'll still be able to check spelling as you work. Also, it uses LaTeX as an export format, which means you can obtain the same professional/technical look that LaTeX affords.
I see it has been reported as a missing feature on the txt2tags homepage...

Tag <code>: how to publish it "correctly"?

I'm not sure how to explain what I'm looking for.
What's the name of the "source code parser" used to publish code in HTML?
For example, when I write some source code here on Stack Overflow, the system auto-detects the syntax and writes "correct" source code in HTML.
I've noticed that the HTML <code> tag exists, but it simply renders source code in a "courier" font.
So I'm asking whether there is some "external" component that, given a text, renders it correctly in an HTML page.
Thank you!
SO uses prettify to syntax highlight the <code> snippets.
Source: Which tools and technologies were used to build the Trilogy?
It is a JavaScript tool that scans a page for code snippets, and colours them on the fly. The downside of this solution is that it doesn't work with JavaScript turned off. Seeing as syntax colouring is not really an essential task, it is arguably a small downside.
The system is called Markdown and here is an explanation of the code blocks it uses.
For the syntax highlighting that you mentioned, a different system is used called prettify.
There are two components to this:
The CSS/HTML structure for syntax highlighting (e.g. styles for printing keywords, numbers, strings, comments etc. in certain colors). This can be generic or per-language.
The code parser (grammar parser), which breaks the code up into tokens and labels the tokens with the appropriate classes. This can be implemented either on the back end, in whatever language your back end uses, or on the front end via JavaScript (an example of the latter is Google's Code Prettify, which is used by Stack Overflow).
It can be coupled with some heuristic logic to decide what language the code belongs to (and thus which grammar/parser to use).
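The two components above can be sketched in a few lines of Python. This is only an illustration of the idea, not how prettify works internally: the token patterns, the keyword list and the CSS class names (com, str, num, kwd) are all assumptions chosen for this example, and the "grammar" is a handful of regular expressions for a Python-like language.

```python
import re
from html import escape

# Component 1: the styling. Class names here are made up for this sketch.
STYLE = """
.kwd { color: #008; }
.str { color: #080; }
.num { color: #066; }
.com { color: #800; }
"""

# Component 2: the tokenizer. Order matters: comments and strings come first
# so that keywords or digits inside them are not labelled separately.
TOKEN_SPEC = [
    ("com", r"#[^\n]*"),
    ("str", r"\"(?:\\.|[^\"\\\n])*\"|'(?:\\.|[^'\\\n])*'"),
    ("num", r"\b\d+(?:\.\d+)?\b"),
    ("kwd", r"\b(?:def|return|if|else|for|while|import|from|class)\b"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def highlight(source):
    """Wrap recognised tokens in <span class="..."> and escape everything."""
    parts = []
    pos = 0
    for match in TOKEN_RE.finditer(source):
        # Text between tokens is emitted as-is (escaped, but unlabelled).
        parts.append(escape(source[pos:match.start()], quote=False))
        parts.append(
            f'<span class="{match.lastgroup}">'
            f"{escape(match.group(), quote=False)}</span>"
        )
        pos = match.end()
    parts.append(escape(source[pos:], quote=False))
    return "<pre><code>" + "".join(parts) + "</code></pre>"


if __name__ == "__main__":
    print(highlight("x = 1  # one"))
```

Running the same logic in the browser instead of on the server gives the JavaScript variant described above; language detection would simply pick a different TOKEN_SPEC.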

WYSIWYG browser editor that generates *good* HTML?

I'm searching for a "suck less" WYSIWYG in-browser X?HTML editor that generates good HTML code.
(no <font>, <foo style="...">, <p></p><span></span><p><span> </span><span><span>blah</span></<span></p> and so on -- <b> and <i> etc is ok).
Should be easy-to-use as it is going to be used by people that do not know what HTML is.
Any suggestions?
Extra points for Copy-and-Paste-from-Word-readiness! :-)
(I found a lot of editors but they all create that <font> and nested <span> crap that breaks site design and bloats a site with one table up to 100kB.)
Download the current version of CKEditor and look at the XHTML output sample. It shows how to use full WYSIWYG editing without it generating font tags or inline styles. You just need to adjust the configuration to your needs.
What about WYMEditor?
WYMeditor has been created to generate perfectly structured XHTML strict code, to conform to the W3C XHTML specifications and to facilitate further processing by modern applications.
With WYMeditor, the code can't be contaminated by visual informations like font styles and weights, borders, colors, ... The end-user defines content meaning, which will determine its aspect by the use of style sheets. The result is easy and quick maintenance of information.
I've used it a little and while it takes quite a bit of tweaking if you have very specific needs, it does work out of the box for simple XHTML editing. If you set up specially annotated CSS files then it will detect the styles you want users to use and block level elements to which they apply. You can also tell it how to display these styles in the editor (which might be different from how you want them displayed in the resulting XHTML).
Of course, it generates XHTML, not HTML, so it may not meet your exact needs.
Wikipedia has a category for them:
http://en.wikipedia.org/wiki/Category:JavaScript-based_HTML_editors
You can use Markdown with the WMD UI, it's the one used by Stack Overflow. It always produces valid HTML code.
I just recently searched for an editor to create solid documentation, whose output is suitable for Subversion diffs: https://superuser.com/questions/126621/wysiwyg-editor-for-structured-text-suitable-for-svn-versioning
The editor that was suggested - "KompoZer" - turned out to be fantastic, especially because it generates very clean HTML (in my opinion). And I say that, although I had originally preferred something leaner than HTML.
P.S. Reading your question again, I'm not sure what you mean by a "browser editor" - are you looking for an editor that can be integrated in an HTML page? KompoZer is based on a browser, but it probably cannot be integrated in an HTML page.
I recently switched one of my projects to markdown to avoid this exact issue. There's still a bit of a learning curve for the users but I haven't had to deal with the usual issues that occur when they copy/paste content from Word and wonder why it blew up.
Having said that, I prefer CKEditor over TinyMCE and the Telerik controls. I've generally found it generates somewhat cleaner HTML.
There are several WYSIWYG editors for embedding within your website out there.
WYMeditor (http://www.wymeditor.org/) looks very nice and seems to be a good fit for targeting clean and valid XHTML results.
Spaw2. Although it's kinda abandoned now.
The Apple Cocoa NSTextView class exports quite nice html, where all the fiddling is done through specifying a style sheet in the header. The Apple TextEdit editor uses this.
http://tinymce.moxiecode.com/ - easy to use, can import from Word, and can restrict formatting to predefined CSS styles, to provide consistent output.
This post is 8+ years old now but still relevant...
I found an awesome GitHub page with a curated list of WYSIWYG editors, including a few WYSIWYM ones which guarantee sane HTML. As of 2018, the most current and best WYSIWYM one looks like ProseMirror, or maybe ORY Editor if you're looking for something to edit entire webpages(!) in one text field.