NSXMLParser cannot handle some HTML entities like à - html

Has anyone run into the XML parser just ending when it encounters HTML entities like à?
Thanks
Deshawn

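Yes. The problem is that NSXMLParser is a strict XML parser, and it stops with an error when it sees an HTML entity like the one you described. Plain XML predefines only five named entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot; and &amp;apos;), so any other HTML entity is undefined unless a DTD declares it. If you want to parse HTML, you will want to use Hpple.
Check out this post for more information: parsing HTML on the iPhone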

Related

My backbone marionette model contains a field with escaped html in it. How do I render that field's contents as HTML and not text?

Classic problem. Want to see html rendered but I'm seeing text in the browser. Whether I tell handlebars js to decode it or not in template ( three curly braces vs two - {{{myHtmlData}}} vs {{myHtmlData}} ) doesn't get me there. Something about the JSON being returned via the model.fetch() has this html data wrapped up in such a way that it is resistant to the notion of displaying as HTML. It's always considered a string whether encoded or decoded so it always displays as text.
Is this just something backbone isn't meant to do?
The technologies involved here are:
backbone.marionette
handlebars.js
.NET Web API
Your data is being escaped automatically. That is a good thing, but if you are sure the data is safe HTML, use {{{ }}} as in this other question: Insert html in a handlebar template without escaping.

Objective-C event-driven HTML parsing

I need to be able to parse HTML snippets in an event-driven way. For example, if the parser finds an HTML tag, it should notify me and pass the HTML tag, value, attributes etc. to a delegate. I cannot use NSXMLParser because I have messy HTML. Is there a useful library for that?
What I want to do is parse the HTML, create an NSAttributedString, and display it in a UITextView.
Yes, you can parse the HTML content of a file.
If you want to get a specific value from the HTML content, you need to parse it using Hpple. Its documentation also includes an example of parsing HTML. Another way is regex, but it is more complicated, so Hpple is the best option in your case.

How to convert non-well formed html to xhtml on iOS

Does anyone know how to convert non-well formed html to XHTML/XML on iOS?
I want to use XPath to parse an HTML page, but the page is not well-formed XML/XHTML.
Do you really need to convert it, or just use xpath? https://github.com/topfunky/hpple or some other tools can parse non-well-formed html.

linqtoxml - insert string literal into xml file

I am using LINQ-to-XML. I am building a small program that helps parse HTML. I'd like to save the HTML tags into an XML file, but I don't want the XML file to check the validity of the entered HTML elements.
How can I just enter a simple string literal (a pretty long one)?
Maybe using a CDATA construct could help you out, see w3schools.com
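One way to read the CDATA suggestion: a CDATA section makes the XML parser treat the HTML as plain character data, so it is never checked as markup. In LINQ to XML itself the XCData class plays this role. The sketch below is in Ruby rather than C#, to match the other code on this page, and it only builds the string by hand; wrap_in_cdata is a hypothetical helper name, not part of any library. The one thing to watch for is the sequence ]]> inside the text, which would end the section early, so the helper splits it across two sections.

```ruby
# Hypothetical helper: wrap arbitrary text in an XML CDATA section.
def wrap_in_cdata(text)
  # "]]>" would terminate the section early, so split it across two sections.
  safe = text.gsub("]]>", "]]]]><![CDATA[>")
  "<![CDATA[#{safe}]]>"
end

# Unbalanced HTML that would not be accepted as ordinary XML child elements.
html = "<div><p>unclosed <b>markup</p>"
puts "<snippet>#{wrap_in_cdata(html)}</snippet>"
```

A parser reading the snippet element back should return the original HTML string as its text content.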

How to sanitize user generated html code in ruby on rails

I am storing user-generated HTML code in the database, but some of it is broken (missing end tags), so it messes up the rendering of the whole page.
How can I prevent this sort of behaviour with Ruby on Rails?
Thanks
It's not too hard to do this with a proper HTML parser like Nokogiri which can perform clean-up as part of the processing method:
require 'nokogiri'
bad_html = '<div><p><strong>bad</p>'
# Parsing as an HTML fragment lets the parser close the unbalanced tags.
puts Nokogiri::HTML.fragment(bad_html).to_s
# <div><p><strong>bad</strong></p></div>
Once parsed properly, you should have fully balanced tags.
My google-fu reveals surprisingly few hits, but here is the top one :)
Valid Well-formed HTML
Try using the h() escape function in your ERB templates to sanitize. That should do the trick.
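For reference, here is a minimal sketch of what that escaping does, assuming plain Ruby outside Rails, where h() is available as ERB::Util.html_escape. Note that this does not repair the broken markup: it turns the tags into visible text, so an unclosed tag can no longer affect the rest of the page.

```ruby
require "erb"

user_html = "<div><p><strong>bad</p>"

# html_escape (aliased as h) replaces the HTML special characters with entities.
puts ERB::Util.html_escape(user_html)
# &lt;div&gt;&lt;p&gt;&lt;strong&gt;bad&lt;/p&gt;
```

If the users are supposed to see their formatting rendered rather than shown as text, escaping alone is not enough and a parser-based clean-up is the better fit.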
Check out Loofah, an HTML sanitization library based on Nokogiri. This will also remove potentially unsafe HTML that could inject malicious script or embed objects on the page. You should also scrub out style blocks, which might mess up the markup on the page.