How to easily rewrap extracted part of HTML into new document - html

I use the XML::Twig Perl module to extract a div out of an HTML document, and would like to create a new HTML doc containing only this div, and the required HTML wrapping. I would then also add some CSS styles to the new document.
Extracting the div is easy, but I'm too lazy to write the HTML wrapping around it myself :-).
There surely must be a Perl module which would do that boring part for me. Or maybe even a method in XML::Twig itself, which I overlooked or didn't understand?

You should be able to create a (very, very!) minimal HTML twig by using my $html= XML::Twig->new->parse_html( '') and then paste your div in it. You may want to replace the empty string with something a little more HTML-y though, or even load a better HTML template, which could have the CSS in it too.
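If you go the template route, the idea is just "fixed skeleton plus slots for the fragment and the CSS". Here is a minimal sketch of that idea in Python rather than Perl, using only the standard library; the names PAGE and wrap_fragment are made up for illustration and are not part of XML::Twig.

```python
from html import escape
from string import Template

# Hypothetical minimal page skeleton; adjust the head to taste.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>$title</title>
<style>
$css
</style>
</head>
<body>
$body
</body>
</html>
""")


def wrap_fragment(fragment, css="", title="Extracted fragment"):
    """Return a complete HTML document containing the already-serialized fragment."""
    # The fragment and CSS are inserted verbatim; only the title is escaped.
    return PAGE.substitute(title=escape(title), css=css, body=fragment)


page = wrap_fragment('<div id="main">Hello</div>', css="div { color: black; }")
```

The same shape should carry over to Perl: serialize the extracted div to a string and drop it into whatever template you prefer.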

Related

NSXMLParser - modify the elements in iPhone

I have an HTML string whose elements I need to modify.
I looked into NSXMLParser, but didn't see any method to modify the elements while reading them.
I don't like the solution of creating a NSMutableString and adding strings to it.
Is there a way to read an HTML string and modify its elements in an elegant way?
e.g.,
<div style="color:grey"></div>
will be
<div style="color:black"></div>
Unfortunately I saw that one cannot use NSXMLDocument in an iPhone app.
See the GitHub project KissXML, which is a clone of the NSXML library and works the same way.
https://github.com/robbiehanson/KissXML
Shameless self-advert: my fork of it works much better with HTML that isn't real XML.
https://github.com/Daij-Djan/KissXML
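The general approach with a document-style parser is: parse into a tree, change the attribute on the node, serialize back. Below is a rough sketch of that idea in Python with the standard library's ElementTree, not KissXML or Objective-C. The helper name set_style_property is invented for this example, and it assumes the input is well-formed XML, which real-world HTML often is not.

```python
import xml.etree.ElementTree as ET


def set_style_property(fragment, prop, value):
    """Set one CSS property in every inline style attribute that declares it."""
    root = ET.fromstring(fragment)  # assumes a well-formed, single-rooted fragment
    for el in root.iter():
        style = el.get("style")
        if style is None:
            continue
        decls = []
        for decl in style.split(";"):
            if not decl.strip():
                continue
            name, _, old = decl.partition(":")
            if name.strip() == prop:
                old = value  # overwrite the matching declaration
            decls.append(f"{name.strip()}:{old.strip()}")
        el.set("style", ";".join(decls))
    # method="html" keeps empty elements as <div></div> instead of <div />
    return ET.tostring(root, encoding="unicode", method="html")


result = set_style_property('<div style="color:grey"></div>', "color", "black")
```

No mutable string juggling is needed; the tree does the bookkeeping.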

Add big piece of HTML to the html-panel

I am new to GWT and I need to add a big piece of HTML code (it contains a lot of nested divs with ids and classes) to the html-panel widget in my Java file.
I have tried to add like this:
HTML html = new HTML("<div class=\"class1\">This is a class1.");
HTML html2 = new HTML("And it ends here</p>");
RootPanel.get().add(html);
RootPanel.get().add(html2);
But I have a problem with the nested divs. Is there a simpler way to add this big piece of code? Thanks.
You, sir, are looking for UiBinder:
https://developers.google.com/web-toolkit/doc/latest/DevGuideUiBinder
UiBinder is great when you are doing a lot in plain HTML.
However, UiBinder offers a lot once you get in deep, so be careful. I recommend looking into CssResource and how it relates to UiBinder, so you can share some CSS or just embed CSS in each UiBinder file. (Note the programmatic access to CSS within UiBinder files.)
Also note other features, such as importing other custom UiBinder files/widgets with namespaces, like the built-in ones (<g:Button> -> <mynamespace:MyCustomWidget>).
But you are probably just looking for laying everything out in UiBinder and defining the @UiField fields in the Java file.
Hope this helps!
-Ashton
If your HTML is static, you should use UiBinder.
If your HTML is generated dynamically, you can use com.google.gwt.user.client.ui.HTML.
Your code is not working because each HTML constructor call must be given a valid, self-contained HTML string; a tag opened in one HTML widget cannot be closed in another.
I suggest you read this guide:
https://developers.google.com/web-toolkit/doc/latest/DevGuideSecuritySafeHtml#Prefer_Plain_Text.

HTML - insert user-created HTML into a HTML page: escaping and discarding format

I have an HTML page which needs to display some HTML generated by the user on the Administration area (like a blog, for instance). The problem is that the user sometimes needs to copy-paste tables and other "garbage" content from Word/Excel to the WYSIWYG editor (that has the proper "paste from Word" function). This causes the resulting HTML code to be very dirty.
It wouldn't be a problem, except that some of these pages render completely wrong: other divs AFTER the user's HTML code are not in their intended positions, floats are not respected, and so on.
I tried putting a div:
<div style="clear: both;"></div>
without success. I even tried with iFrames, but iFrames accept only external webpages (if applicable...).
The question is: is there any tag or method to put a part of an HTML code inside a webpage discarding all formatting AFTER this code?
Thank you.
To my knowledge, you simply want to end all divs. HTML is a very simple language, with very simple options. Your example doesn't work because HTML isn't that advanced. You can either start an element (<...>) or end an element (</...>).
Ideally what you want is a piece of code that puts their work in a separate frame entirely, so as soon as the page passes their code, it goes back to the correct formatting.
Or, you could be really sloppy and put one hundred </div>'s in, just in case.
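On the "separate frame" idea: an iframe does not have to point at an external page. HTML5 defines a srcdoc attribute that carries the frame's document inline, so the messy markup is parsed as its own document and cannot leave unclosed tags in yours. A rough server-side sketch in Python follows; the function name isolate_in_iframe is invented here, and sizing the frame to its content is left out.

```python
from html import escape


def isolate_in_iframe(user_html, title="User content"):
    """Wrap untrusted markup in an iframe so it cannot affect the host page layout."""
    # The markup must be attribute-escaped; the browser unescapes it before parsing.
    return (
        f'<iframe sandbox title="{escape(title, quote=True)}" '
        f'srcdoc="{escape(user_html, quote=True)}"></iframe>'
    )


snippet = isolate_in_iframe('<table><tr><td class="x">pasted from Word</td>')
```

The bare sandbox attribute also disables scripts inside the frame, which is usually what you want for pasted content.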

How to display source code with indent in a web page? HTML? CSS?

I want to show some source code with the WebBrowser control on a WinForm, and I want to decorate the source code using HTML tags to set things such as color, font, and size. But I found it difficult to display the indentation properly.
To be precise, my source code is held in a String[], and each String already has the proper indentation (spaces or tabs). But it seems this indentation is just ignored by the WebBrowser control.
Could someone tell me how to do this?
I like to paste my code in a Gist and then display it that way. Github will recognize the code and format it accordingly.
If you're going to be doing it often you could try markdown.
Or use a one-off formatter like Syntax Highlighter.
Use the <pre> element, which preserves whitespace (with <code> elements and appropriate class names to mark up the parts you want to syntax highlight):
<pre><code class="javascript"><code class="keyword">function</code> <code class="name">foo</code>()…
You might want to look into this JavaScript library to highlight and format your code. http://code.google.com/p/syntaxhighlighter/
Or you can check out a service like this - http://pygments.appspot.com/ or this - http://hilite.me/
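If you generate the markup yourself, the two things that matter are wrapping the text in <pre> and escaping <, > and & so the code is not read as tags. A small sketch of that, written in Python for brevity (the question is about .NET; render_source is a made-up helper name):

```python
from html import escape


def render_source(lines, tab_width=4):
    """Join source lines into a <pre><code> block that keeps their indentation."""
    body = "\n".join(
        # Expand tabs so the indent width does not depend on the browser,
        # then escape characters that would otherwise be parsed as markup.
        escape(line.expandtabs(tab_width), quote=False)
        for line in lines
    )
    return f"<pre><code>{body}</code></pre>"


markup = render_source(["if (a < b) {", "\treturn a;", "}"])
```

Leading spaces survive because <pre> turns off whitespace collapsing; colors and fonts can then be layered on with CSS.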

How to sanitize user generated html code in ruby on rails

I am storing user-generated HTML code in the database, but some of it is broken (missing end tags), so rendering this code messes up the whole page.
How can I prevent this sort of behaviour with Ruby on Rails?
Thanks
It's not too hard to do this with a proper HTML parser like Nokogiri which can perform clean-up as part of the processing method:
require 'nokogiri'
bad_html = '<div><p><strong>bad</p>'
puts Nokogiri::HTML.fragment(bad_html).to_s
# <div><p><strong>bad</strong></p></div>
Once parsed properly, you should have fully balanced tags.
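To make it clearer what "balanced" means here, this is a toy tag balancer written in Python with only the standard library. It is not how Nokogiri works internally and is far less forgiving than a real HTML parser; TagBalancer and balance are names made up for this sketch, and comments in the input are simply dropped.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagBalancer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []  # stack of elements still waiting for a close tag

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray close tag with no matching open tag: drop it
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")  # close anything left open inside
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance(fragment):
    """Return the fragment with every opened element closed in order."""
    parser = TagBalancer()
    parser.feed(fragment)
    parser.close()
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


fixed = balance("<div><p><strong>bad</p>")
```

In a real Rails app, stick with Nokogiri as shown above; the sketch only illustrates the stack-based idea.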
My google-fu reveals surprisingly few hits, but here is the top one :)
Valid Well-formed HTML
Try using the h() escape function in your ERB templates to sanitize. That should do the trick.
Check out Loofah, an HTML sanitization library based on Nokogiri. This will also remove potentially unsafe HTML that could inject malicious script or embed objects on the page. You should also scrub out style blocks, which might mess up the markup on the page.