Parsing a web page with TFHpple - html

I'm trying to write a very simple iOS app that will parse a webpage (http://arxiv.org/list/cond-mat/recent) and display a simplified version of it. I chose to use TFHpple to parse this page. I want to get titles of papers and display them in the TableViewController. The HTML container for paper descriptions looks like:
<div class="list-title">
<span class="descriptor">Title:</span> Encoding Complexity within Supramolecular Analogues of Frustrated Magnets
</div>
The function I use to parse the page and get the values is the following (thanks to raywenderlich.com):
- (void)loadPapers {
    NSURL *papersURL = [NSURL URLWithString:@"http://www.arxiv.org/list/cond-mat/recent"];
    NSData *papersHTMLData = [NSData dataWithContentsOfURL:papersURL];
    TFHpple *papersParser = [TFHpple hppleWithHTMLData:papersHTMLData];
    NSString *papersXpathQueryString = @"//div[@class='list-title']";
    NSArray *papersNodes = [papersParser searchWithXPathQuery:papersXpathQueryString];
    NSMutableArray *newPapers = [[NSMutableArray alloc] initWithCapacity:0];
    for (TFHppleElement *element in papersNodes) {
        Paper *paper = [[Paper alloc] init];
        [newPapers addObject:paper];
        paper.title = [[element firstChild] content];
    }
    _objects = newPapers;
    [self.tableView reloadData];
}
This function is supposed to parse the entire HTML page and feed the data into the table view. However, when I run it, the papers it produces from the papersNodes array are all empty. The number of elements is correct (~25), but every title is empty and I am not sure why.
Any help is greatly appreciated! Thanks!

I have rewritten your code with HTMLKit. It looks like this:
NSURL *papersURL = [NSURL URLWithString:@"http://www.arxiv.org/list/cond-mat/recent"];
NSData *papersHTMLData = [NSData dataWithContentsOfURL:papersURL];
NSString *htmlString = [[NSString alloc] initWithData:papersHTMLData encoding:NSUTF8StringEncoding];
HTMLDocument *document = [HTMLDocument documentWithString:htmlString];
NSArray *divs = [document querySelectorAll:@"div[class='list-title']"];
for (HTMLElement *element in divs) {
    NSLog(@"%@", element.textContent);
}
Back to your question in the comment:
Could you give some useful links that you find good to learn about HTMLKit?
You can check out the examples on the project's GitHub page. The source code is documented and using it is relatively straightforward. If you have basic HTML & CSS experience, then using HTMLKit should be just as easy. Unfortunately, there are no other resources for learning it yet.

Probably the [element firstChild] is returning nil. I suggest you add some NSLog statements to track the data extraction and help you pinpoint the error.

Related

Convert HTML string to plain string Objective-C

I made a UITableView in which each cell contains a UILabel. The input is an HTML string (like a website page source), e.g. from Example Domain:
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is established to be used for illustrative examples in documents. You may use this
domain in examples without prior coordination or asking for permission.</p>
<p>More information...</p>
</div>
</body>
but much larger, over 4000 characters.
I want to convert those HTML strings to plain strings. For the example above, the result would be:
Example Domain
This domain is established to be used for illustrative examples in documents. You may use this domain in examples without prior
coordination or asking for permission.
More information...
and then display them in a UILabel. Currently I do it this way:
NSData *htmlData = [htmlString dataUsingEncoding:NSUnicodeStringEncoding];
NSAttributedString *attrString = [[NSAttributedString alloc]
    initWithData:htmlData
    options:@{
        NSDocumentTypeDocumentAttribute : NSHTMLTextDocumentType
    }
    documentAttributes:nil
    error:nil];
NSString *plainString = attrString.string;
This works fine, but the performance is very bad, which causes flickering when scrolling through the tableView. Is there any more efficient way to do this?
Use the code below:
NSString *htmlString = ...;
NSAttributedString *attrStr = [[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUnicodeStringEncoding] options:@{ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType } documentAttributes:nil error:nil];
UILabel *myLabel = [[UILabel alloc] init];
myLabel.attributedText = attrStr;
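One possible way to reduce the repeated conversion cost while scrolling, not taken from the answers above, is to convert each HTML string only once and reuse the result. The sketch below is a minimal illustration using Foundation's NSCache. The class name PlainTextCache and its methods are hypothetical, and the converter block stands in for whatever conversion you use (for example the NSAttributedString code from the question).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any library): converts each HTML string
// at most once and returns the cached plain text on later requests.
@interface PlainTextCache : NSObject
- (instancetype)initWithConverter:(NSString *(^)(NSString *html))converter;
- (NSString *)plainTextForHTML:(NSString *)html;
@end

@implementation PlainTextCache {
    NSCache<NSString *, NSString *> *_cache;
    NSString *(^_converter)(NSString *);
}

- (instancetype)initWithConverter:(NSString *(^)(NSString *html))converter {
    if ((self = [super init])) {
        _cache = [[NSCache alloc] init];
        _converter = [converter copy];
    }
    return self;
}

- (NSString *)plainTextForHTML:(NSString *)html {
    // Return the stored result if this HTML string was converted before.
    NSString *cached = [_cache objectForKey:html];
    if (cached != nil) {
        return cached;
    }
    // Otherwise run the (expensive) conversion once and remember the result.
    NSString *plain = _converter(html);
    if (plain != nil) {
        [_cache setObject:plain forKey:html];
    }
    return plain;
}
@end
```

With something like this, cellForRowAtIndexPath: would ask the cache for the plain text instead of building a new NSAttributedString every time a cell scrolls into view. Whether this removes the flicker in your case is an assumption that needs measuring.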

Scan for images in website - Xcode

I am making an app which will show me the latest news along with an image. I get the text by using a scanner like this:
NSMutableURLRequest *request = [[NSMutableURLRequest alloc] init];
/* set headers, etc. on request if needed */
[request setURL:[NSURL URLWithString:@"http://stackoverflow.com/questions/22671347/nsuinteger-should-not-be-used-in-format-strings"]];
NSData *data = [NSURLConnection sendSynchronousRequest:request returningResponse:NULL error:NULL];
NSString *html = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
NSScanner *scanner = [NSScanner scannerWithString:html];
NSString *token = nil;
[scanner scanUpToString:@"<p>" intoString:NULL];
[scanner scanUpToString:@"</p>" intoString:&token];
int length = 3;
token = [token substringFromIndex:length];
textView.text = token;
Now I was wondering if I could use the same type of code to scan the website, find the first image, and put it in an image view. It doesn't have to be the same type of code; post whatever you know, any method is fine.
In summary:
I want a piece of code that will scan a webpage, pick up the first image, and place it in an image view.
Thanks for the people who take the time to help me.
THANKS AGAIN!!!
BYE!!!
NSScanner is not an HTML parser; it is only intended for scanning values from an NSString object. If you're doing the odd scan you could probably get away with it, but it doesn't seem like...
The correct approach is to use the libxml2 library included with Xcode. It is written in C and does not have an Objective-C/Swift wrapper. libxml2 is the XML C parser and toolkit developed for the GNOME project. Alternatively, I would recommend using an open-source project such as HTMLReader. It is an HTML parser with CSS selectors built on Objective-C and Foundation. It parses HTML just like a browser and is written entirely in Objective-C.
Example (using HTMLReader):
HTMLDocument *document = [HTMLDocument documentWithString:html]; // html is your HTML string
NSLog(@"IMG: %@", [document firstNodeMatchingSelector:@"img"].attributes[@"src"]); // => URL of the first image
To find images, just use the img tag as the selector and you're set!
If you're using libxml2, take a look at the HTMLparser.h header file to parse and retrieve HTML tags.

iOS - Showing App version in HTML page

I'm showing an About page as HTML in my iOS application. About.html is bundled with the application.
I want to show the application version in the About page automatically so that I don't have to edit the HTML manually every time I bump the version.
Currently I do it as follows:
I write the HTML as <b>Version %@</b>
In the Objective-C code, I fill it in like this:
NSString *aboutFilePath = [[NSBundle mainBundle] pathForResource:@"About" ofType:@"html"];
NSString *htmlStr = [[NSString alloc] initWithContentsOfFile:aboutFilePath encoding:NSUTF8StringEncoding error:nil];
NSString *formattedStr = [NSString stringWithFormat:htmlStr, [self appNameAndVersionNumberDisplayString]];
[self.webView loadHTMLString:formattedStr baseURL:nil];
- (NSString *)appNameAndVersionNumberDisplayString {
    NSDictionary *infoDictionary = [[NSBundle mainBundle] infoDictionary];
    NSString *appDisplayName = [infoDictionary objectForKey:@"CFBundleDisplayName"];
    NSString *majorVersion = [infoDictionary objectForKey:@"CFBundleShortVersionString"];
    NSString *minorVersion = [infoDictionary objectForKey:@"CFBundleVersion"];
    return [NSString stringWithFormat:@"%@, Version %@ (%@)",
            appDisplayName, majorVersion, minorVersion];
}
Is this a good way to do it, or is there a better way to achieve it?
Your approach is close to ideal. There is no problem with proceeding this way.
That's ok. If you want to localise the text in the future then that might change the order of the parameters in the text so using something more like SUB_VERSION as the identifier to be replaced and then using the string replacement methods (instead of format method) would be better.
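The placeholder idea could look roughly like the sketch below. It assumes About.html contains a literal token such as SUB_VERSION instead of %@. The function name AboutHTMLByInsertingVersion is hypothetical. A side benefit is that a stray % in the HTML (for example width: 100% in CSS) is no longer interpreted as a format specifier.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every occurrence of the named placeholder
// (assumed here to be the literal text "SUB_VERSION") with the version string.
static NSString *AboutHTMLByInsertingVersion(NSString *templateHTML, NSString *version) {
    return [templateHTML stringByReplacingOccurrencesOfString:@"SUB_VERSION"
                                                   withString:version];
}
```

In the question's code, this would take the place of the stringWithFormat: call, with htmlStr as the template and the result of appNameAndVersionNumberDisplayString as the version.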

Unable to load html string in UIWebView using loadHTMLString:baseURL in iOS?

I am trying to embed a YouTube video in my iOS application. For that, I have created a UIWebView and am trying to load the YouTube video following here
I have gone through all the answers to the above problem. Even then, it's not working.
I have also tried loading very simple HTML
NSString *embedHTML = [NSString stringWithFormat:@"<html><body>Hello World</body></html>"];
[webView loadHTMLString:embedHTML baseURL:nil];
Even then, I am getting the compile error Parse Issue: Expected ']'
I have tried cleaning, quitting Xcode, and relaunching it. I don't know why I am not able to use that method. How do I use the loadHTMLString method above with my UIWebView?
PS : Please do not tag this question as duplicate. I have tried all the solutions in Stackoverflow. Nothing has worked
UIWebView *webDesc = [[UIWebView alloc] initWithFrame:CGRectMake(12, 50, 276, 228)];
NSString *embedHTML = @"<html><head></head><body><p>1. You agree that you will be the technician servicing this work order?.<br>2. You are comfortable with the scope of work on this work order?.<br>3. You understand that if you go to site and fail to do quality repair for any reason, you will not be paid?.<br>4. You must dress business casual when going on the work order.</p></body></html>";
webDesc.userInteractionEnabled = NO;
webDesc.opaque = NO;
webDesc.backgroundColor = [UIColor clearColor];
[webDesc loadHTMLString:embedHTML baseURL:nil];
- (NSString *)getHTMLContent
{
    NSString *cssPath = [[NSBundle mainBundle] pathForResource:@"baseline" ofType:@"css"];
    NSData *cssData = [NSData dataWithContentsOfFile:cssPath];
    NSString *cssStyle = [[NSString alloc] initWithData:cssData encoding:NSASCIIStringEncoding];
    NSDateFormatter *dateFormatter = [[NSDateFormatter alloc] init];
    [dateFormatter setDateStyle:NSDateFormatterMediumStyle];
    NSString *subtitle = [NSString stringWithFormat:@"%@ | %@", self.article.author, [dateFormatter stringFromDate:self.article.publishedDate]];
    NSString *htmlString = [NSString stringWithFormat:@"<html><head><meta name='viewport' content='width=device-width; initial-scale=1.0; maximum-scale=1.0;'></head><style type=\"text/css\">%@</style><body><div id=\"container\"><h1>%@</h1><p class='subtitle'>%@</p>%@</div></body></html>", cssStyle, self.article.title, subtitle, self.article.content];
    return htmlString;
}
It is very simple. You only need one line to load the string. Try it:
NSString *htmlString = @"<html><head></head><body><p>1. You agree that you will be the technician servicing this work order?.<br>2. You are comfortable with the scope of work on this work order?.<br>3. You understand that if you go to site and fail to do quality repair for any reason, you will not be paid?.<br>4. You must dress business casual when going on the work order.</p></body></html>";
[webView loadHTMLString:htmlString baseURL:nil];
You probably need to provide more code if you want people to help you. I just used webView in a similar way and it's working fine.
self.webView = [[UIWebView alloc] initWithFrame:myFrame];
self.webView.scalesPageToFit = YES;
[self.webView setBackgroundColor:[UIColor whiteColor]];
// pass the string to the web view
[self.webView loadHTMLString:[[self.itineraryArray objectAtIndex:indexPath.row] valueForKey:@"body"] baseURL:nil];
// add it as a subview
[self.view addSubview:self.webView];
Can you provide more information and code?
This doesn't seem to be an issue with the web view. Please add the code where it breaks. The error says you missed a ] somewhere in the code.
Try loading the URL directly into the web view:
UIWebView *webview = [[UIWebView alloc] initWithFrame:CGRectMake(100, 100, 200, 250)];
[webview loadRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"http://youtube.com/embed/-0Xa4bHcJu8"]]];

How would you create a new local HTML file into a UIWebView on iOS?

First I would like to point out I'm very new to web development and more comfortable with iOS development, so excuse me if there's something fundamental I'm not understanding.
I've seen how you can place an HTML file into your apps directory and load it into a web view (Example). This is great, but how can you create a new local HTML file within the app? Such that the user can create a new html file to type in, and then store it (basic document style app functionality). Perhaps with some sort of Javascript (I'm not too familiar with such Javascript)?
You can build the HTML in an NSString like so:
// get user input
NSString *userText = @"Hello, world!";
// build the HTML
NSString *html = [NSString stringWithFormat:@"<html><body>%@</body></html>", userText];
// build the path where you're going to save the HTML
NSString *docsFolder = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES)[0];
NSString *filename = [docsFolder stringByAppendingPathComponent:@"sample.html"];
// save the NSString that contains the HTML to a file
NSError *error;
[html writeToFile:filename atomically:NO encoding:NSUTF8StringEncoding error:&error];
Clearly I'm just setting the userText to some literal, but you could obviously let the user type it into a UITextView and grab it from there.
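If the text really does come from the user, characters such as < or & would be interpreted as markup once they are placed inside the body. If the input is meant to be shown as plain text, one option is to escape it before building the HTML. The function below is a minimal sketch of that idea. The name EscapeHTML is hypothetical, and it only handles the three characters shown.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: escapes the characters that would otherwise be parsed
// as HTML markup. The ampersand is replaced first so that the entities added
// in the later steps are not escaped a second time.
static NSString *EscapeHTML(NSString *text) {
    NSString *escaped = [text stringByReplacingOccurrencesOfString:@"&" withString:@"&amp;"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"<" withString:@"&lt;"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@">" withString:@"&gt;"];
    return escaped;
}
```

The escaped string would then be passed to stringWithFormat: in place of the raw userText. If the user is expected to type HTML themselves, skip this step.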
You can then load the HTML into a web view with:
[self.webView loadHTMLString:html baseURL:nil];
or with:
NSURL *url = [NSURL fileURLWithPath:filename];
NSURLRequest *request = [NSURLRequest requestWithURL:url];
[self.webView loadRequest:request];