Convert HTML string to plain string in Objective-C

I made a UITableView in which each cell contains a UILabel. The input is an HTML string (like a web page's source), e.g. from Example Domain:
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is established to be used for illustrative examples in documents. You may use this
domain in examples without prior coordination or asking for permission.</p>
<p>More information...</p>
</div>
</body>
but much larger, over 4000 characters.
I want to convert those HTML strings to plain strings. For the example above, the result would be:
Example Domain
This domain is established to be used for illustrative examples in documents. You may use this domain in examples without prior
coordination or asking for permission.
More information...
and then display them in the UILabel. Currently I'm doing it this way:
NSData *htmlData = [htmlString dataUsingEncoding:NSUnicodeStringEncoding];
NSAttributedString *attrString = [[NSAttributedString alloc]
initWithData:htmlData
options:@{
NSDocumentTypeDocumentAttribute : NSHTMLTextDocumentType
}
documentAttributes:nil
error:nil];
NSString *plainString = attrString.string;
This works fine, but the performance is very bad, which causes flickering when scrolling through the tableView. Is there any more efficient way to do this?

Use the code below:
NSString *htmlString = ...;
NSAttributedString *attrStr = [[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUnicodeStringEncoding] options:@{ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType } documentAttributes:nil error:nil];
UILabel *myLabel = [[UILabel alloc] init];
myLabel.attributedText = attrStr;
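One way to reduce the scrolling cost, assuming the conversion currently runs every time a cell is configured, is to convert each HTML string only once and reuse the result. The sketch below is not part of the original answer. PlainTextCache is a hypothetical helper name, and the converter block stands in for whatever conversion you use (for example the NSAttributedString code above). It assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): remembers the plain-text
// result for each HTML string so the expensive conversion runs once per string.
@interface PlainTextCache : NSObject
- (instancetype)initWithConverter:(NSString *(^)(NSString *html))converter;
- (NSString *)plainTextForHTML:(NSString *)html;
@end

@implementation PlainTextCache {
    NSCache<NSString *, NSString *> *_cache;
    NSString *(^_converter)(NSString *html);
}

- (instancetype)initWithConverter:(NSString *(^)(NSString *html))converter {
    if ((self = [super init])) {
        _cache = [[NSCache alloc] init];
        _converter = [converter copy];
    }
    return self;
}

- (NSString *)plainTextForHTML:(NSString *)html {
    // Return the stored result if this HTML string was converted before.
    NSString *cached = [_cache objectForKey:html];
    if (cached != nil) {
        return cached;
    }
    // Otherwise convert once and store the result (empty string on failure).
    NSString *plain = _converter(html) ?: @"";
    [_cache setObject:plain forKey:html];
    return plain;
}
@end
```

In tableView:cellForRowAtIndexPath: the label text would then come from [cache plainTextForHTML:htmlString], so scrolling back to a row that was already shown does not repeat the HTML import. NSCache may evict entries under memory pressure, in which case the string is simply converted again.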

Related

Cocoa webview iframe (objective c)

Good day!
I have an HTML page with an iframe src.... in it. I need to get all the HTML content of this frame into an NSString.
Could you please advise on the best way to do it without redirecting the webview to the src?
Thank you.
You can feed the HTML text into an NSAttributedString and extract the string from there.
NSAttributedString *attrString = [[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding] options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)} documentAttributes:nil error:nil];
NSString *finalString = [attrString string];

ios: Html parsing iframe tag issue

I am using the default html parser to parse the html text:
NSData *data = [receivedText dataUsingEncoding:NSUTF8StringEncoding];
NSMutableAttributedString *text = [[NSMutableAttributedString alloc] initWithData:data
options:@{
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
documentAttributes:nil error:nil];
But when the received HTML text contains an iframe tag, my app crashes with a bad access error.
My html text is
<p dir="ltr">iFrame tag test<iframe src='http://www.test.com/'></iframe></p>
Is there something wrong in the code? App works fine when I replace NSHTMLTextDocumentType with any other type, but I need to use this type only.
I am using UITextView to display it.
Use a web view instead (UIWebView, or WKWebView now that UIWebView is deprecated). iframe tags may not work with UITextView.
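If the text has to stay in a UITextView and the iframe content itself is not needed, one possible workaround is to strip iframe elements from the HTML before handing it to the importer. The sketch below is an assumption, not something the answer above proposes. StringByRemovingIframes is a hypothetical helper, it uses a regular expression rather than a real HTML parser, and whether it avoids the crash in your case would need to be verified.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes <iframe ...>...</iframe> elements (and stray,
// unclosed <iframe ...> tags) from an HTML string using a regular expression.
static NSString *StringByRemovingIframes(NSString *html) {
    NSString *pattern = @"<iframe\\b[^>]*>.*?</iframe\\s*>|<iframe\\b[^>]*>";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive |
                                                          NSRegularExpressionDotMatchesLineSeparators
                                                    error:&error];
    if (regex == nil) {
        // Pattern failed to compile; return the input unchanged.
        return html;
    }
    return [regex stringByReplacingMatchesInString:html
                                           options:0
                                             range:NSMakeRange(0, html.length)
                                      withTemplate:@""];
}
```

The cleaned string can then be converted with dataUsingEncoding: and passed to the NSMutableAttributedString initializer from the question.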

Parsing a web page with TFHpple

I'm trying to write a very simple iOS app that will parse a webpage (http://arxiv.org/list/cond-mat/recent) and display a simplified version of it. I chose to use TFHpple to parse this page. I want to get titles of papers and display them in the TableViewController. The HTML container for paper descriptions looks like:
<div class="list-title">
<span class="descriptor">Title:</span> Encoding Complexity within Supramolecular Analogues of Frustrated Magnets
</div>
The function I use to parse the page and get the values is the following (thanks to raywenderlich.com):
- (void) loadPapers{
NSURL *papersURL = [NSURL URLWithString:@"http://www.arxiv.org/list/cond-mat/recent"];
NSData *papersHTMLData = [NSData dataWithContentsOfURL:papersURL];
TFHpple *papersParser = [TFHpple hppleWithHTMLData:papersHTMLData];
NSString *papersXpathQueryString = @"//div[@class='list-title']";
NSArray *papersNodes = [papersParser searchWithXPathQuery:papersXpathQueryString];
NSMutableArray *newPapers = [[NSMutableArray alloc] initWithCapacity:0];
for (TFHppleElement *element in papersNodes){
Paper *paper = [[Paper alloc] init];
[newPapers addObject:paper];
paper.title = [[element firstChild] content];
}
_objects = newPapers;
[self.tableView reloadData];
}
This function is supposed to parse the entire HTML page and return the data to the table view. However, when I try it, the papersNodes array contains empty objects. The number of elements is correct (~25), but they're all empty and I am not sure why.
Any help is greatly appreciated! Thanks!
I have rewritten your code with HTMLKit. It looks like this:
NSURL *papersURL = [NSURL URLWithString:@"http://www.arxiv.org/list/cond-mat/recent"];
NSData *papersHTMLData = [NSData dataWithContentsOfURL:papersURL];
NSString *htmlString = [[NSString alloc] initWithData:papersHTMLData encoding:NSUTF8StringEncoding];
HTMLDocument *document = [HTMLDocument documentWithString:htmlString];
NSArray *divs = [document querySelectorAll:@"div[class='list-title']"];
for (HTMLElement *element in divs) {
NSLog(@"%@", element.textContent);
}
Back to your question in the comment:
Could you give some useful links that you find good to learn about HTMLKit?
You can check out the examples on the project's GitHub page. The source code is documented and using it is relatively straightforward. If you have basic HTML and CSS experience, then using HTMLKit should be just as easy. Unfortunately, there are no other resources for learning it yet.
Probably [element firstChild] is returning nil. I suggest you add some NSLog statements to track the data extraction and help you pinpoint the error.

Scan for images in website - Xcode

I am making an app which will give me the latest news along with an image. I get the text part by using a scanner like this:
NSMutableURLRequest *request = [[NSMutableURLRequest alloc] init];
/* set headers, etc. on request if needed */
[request setURL:[NSURL URLWithString:@"http://stackoverflow.com/questions/22671347/nsuinteger-should-not-be-used-in-format-strings"]];
NSData *data = [NSURLConnection sendSynchronousRequest:request returningResponse:NULL error:NULL];
NSString *html = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
NSScanner *scanner = [NSScanner scannerWithString:html];
NSString *token = nil;
[scanner scanUpToString:@"<p>" intoString:NULL];
[scanner scanUpToString:@"</p>" intoString:&token];
int length = 3;
token = [token substringFromIndex:length];
textView.text = token;
Now I was wondering if I could use the same type of code to scan the website, find the first image and put it in an image view. It doesn't have to be the same type of code; post whatever you know, using any method.
In summary:
I want a piece of code that will scan a webpage, pick up the first image and place it in an image view.
Thanks to the people who take the time to help me.
THANKS AGAIN!!!
BYE!!!
NSScanner is not an HTML parser; it is only intended for scanning values from an NSString object. If you're doing the odd scan you could probably get away with it, but it doesn't seem like...
The correct approach is to use the libxml2 library included with Xcode. It is written in C and does not come with an Objective-C/Swift wrapper. Libxml2 is the XML C parser and toolkit developed for the Gnome project. Alternatively, I would recommend using an open-source project such as HTMLReader. It's an HTML parser with CSS selectors, built on Objective-C and Foundation. It parses HTML just like a browser and is written entirely in Objective-C.
Example (using HTMLReader):
HTMLDocument *document = [HTMLDocument documentWithString:html]; // html is the page source string
NSLog(@"IMG: %@", [document firstNodeMatchingSelector:@"img"].attributes[@"src"]); // => src of the first image
To find images, just use the img tag as the selector and you're set!
If you're using libxml2, take a look at the HTMLparser.h header file for parsing and retrieving HTML tags.
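Once the src attribute of the first img element is known, it usually still has to be turned into an absolute URL, because pages often use relative paths. The sketch below shows only that step, using Foundation alone. AbsoluteImageURL is a hypothetical helper name and is not part of HTMLReader or libxml2.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: resolves an img src value (absolute or relative)
// against the URL of the page it was found on.
static NSURL *AbsoluteImageURL(NSString *src, NSURL *pageURL) {
    NSString *trimmed = [src stringByTrimmingCharactersInSet:
                            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    if (trimmed.length == 0) {
        // No usable src attribute.
        return nil;
    }
    // URLWithString:relativeToURL: leaves absolute URLs as they are and
    // resolves relative ones against pageURL.
    return [[NSURL URLWithString:trimmed relativeToURL:pageURL] absoluteURL];
}
```

The resulting URL can then be downloaded asynchronously (for example with an NSURLSession data task) and the data turned into a UIImage that is assigned to the image view on the main queue.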

Write HTML data to NSPasteboard

I am trying to copy HTML data to the clipboard on a Mac. But when I check the clipboard (Finder -> Edit menu -> Show Clipboard), it shows nothing. I want the formatted data to be pasted as is. Below is the code I am using:
NSPasteboard *pb = [NSPasteboard generalPasteboard];
NSAttributedString *htmlString = [[NSAttributedString alloc] initWithString:@"<html><body><b>abcdefgh</b></body></html>"];
NSDictionary *documentAttributes = [NSDictionary dictionaryWithObjectsAndKeys:NSHTMLTextDocumentType, NSDocumentTypeDocumentAttribute, nil];
NSData *htmlData = [htmlString dataFromRange:NSMakeRange(0, htmlString.length) documentAttributes:documentAttributes error:NULL];
[pb declareTypes:[NSArray arrayWithObject:NSHTMLPboardType] owner:nil];
[pb setData:htmlData forType:NSHTMLPboardType];
I will appreciate any help. Thanks!
Update:
To get the HTML-formatted string onto the clipboard, I tried converting it to an attributed string and then from the attributed string to RTF data. This puts the data on the clipboard perfectly, but when I paste it into an HTML editor ( http://htmleditor.in/ ) it loses some formatting, such as colour.
NSPasteboard *pb = [NSPasteboard generalPasteboard];
NSString *htmlString = @"<HTML> <BODY> <P><FONT COLOR=\"RED\"><UL><LI>fhhj</LI><LI>juil</LI></UL><B>hello</B> <i>html</i></FONT></P> </BODY></HTML>";
NSDictionary *documentAttributes = [NSDictionary dictionaryWithObjectsAndKeys:NSHTMLTextDocumentType, NSDocumentTypeDocumentAttribute, [NSNumber numberWithUnsignedInteger:NSUTF8StringEncoding], NSCharacterEncodingDocumentAttribute, nil]; // object first, then its key
NSAttributedString* atr = [[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding] options:documentAttributes documentAttributes:nil error:nil];
NSData *rtf = [atr RTFFromRange:NSMakeRange(0, [atr length])
documentAttributes:nil];
[pb declareTypes:[NSArray arrayWithObject:NSRTFPboardType] owner:nil];
[pb setData:rtf forType:NSRTFPboardType];
How can I preserve color when pasting into the HTML editor? The text is shown with color on the clipboard, so why isn't it pasted with color in the HTML editor (http://htmleditor.in/)?
I finally found the issue: the behaviour is browser dependent (I don't understand the reason, though).
If I paste into the site in Safari, it doesn't lose any formatting and works well, but if I paste the same text into the same site in Chrome, it loses some formatting.
Edit
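If you want it copied with the attributes applied, convert the HTML to an attributed string and put its RTF data on the pasteboard:
NSData *htmlData = [@"YOUR HTML CODE" dataUsingEncoding:NSUTF8StringEncoding];
NSAttributedString *attrString = [[NSAttributedString alloc] initWithHTML:htmlData documentAttributes:NULL];
NSData *rtfData = [attrString RTFFromRange:NSMakeRange(0, attrString.length) documentAttributes:@{}];
[pb setData:rtfData forType:NSPasteboardTypeRTF];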