Extracting images from HTML with the HTMLReader library in Objective-C

I am an Objective-C beginner building an iOS app that extracts images from HTML pages, the way Pinterest does.
Searching Stack Overflow, I found that many people recommend HTMLReader for parsing HTML, so I installed it.
However, apart from the sample code on the GitHub page, there is little documentation on the web that explains how to use the library.
Could anyone advise me on how to extract image URLs from web pages using HTMLReader?
Below is what I tried.
Objective-C Code
NSString *htmlBody = [_webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];
HTMLDocument *document = [HTMLDocument documentWithString:htmlBody];
NSLog(@"%@", [document firstNodeMatchingSelector:@"img src"].textContent);
Sample web URL I tried to parse
http://www.lifehack.org/articles/lifestyle/100-life-hacks-that-make-life-easier.html
Expected Outcome on the console
http://cdn-media-1.lifehack.org/wp-content/files/2014/09/US_map_-_states-370x208.png
(since it is the first image that comes up when I search for "src img" on this site)
Actual Outcome I got
(Null)

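The selector "img src" is a descendant selector: it looks for a <src> element inside an <img> element, which never exists, so the result is nil. The URL is also stored in an attribute, not in the element's text content. Select the img element itself:
HTMLElement *element = [document firstNodeMatchingSelector:@"img"];
To get the image URL, read the "src" key from the HTMLElement's attributes dictionary:
NSString *urlString = element.attributes[@"src"];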
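For a Pinterest-style picker you would probably want every image on the page, not just the first one. The following is a minimal sketch, assuming HTMLReader is importable as a framework (the import line may differ with your setup). `ImageURLsInHTML` and its `pageURL` parameter are hypothetical names introduced here; `pageURL` is the address the HTML was loaded from and is used to resolve relative `src` values.

```objective-c
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h> // assumption: adjust to how HTMLReader is installed

// Hypothetical helper: returns the absolute URL of every <img> that has a src attribute.
static NSArray<NSURL *> *ImageURLsInHTML(NSString *html, NSURL *pageURL) {
    HTMLDocument *document = [HTMLDocument documentWithString:html];
    NSMutableArray<NSURL *> *urls = [NSMutableArray array];
    for (HTMLElement *element in [document nodesMatchingSelector:@"img"]) {
        NSString *src = element.attributes[@"src"];
        if (src.length == 0) {
            continue; // <img> without a usable src
        }
        // Resolves relative paths such as "/files/a.png" against the page address.
        NSURL *url = [NSURL URLWithString:src relativeToURL:pageURL];
        if (url != nil) {
            [urls addObject:url.absoluteURL];
        }
    }
    return urls;
}
```

Usage would look like `ImageURLsInHTML(htmlBody, _webView.request.URL)`. Images injected by lazy-loading scripts may keep their real address in another attribute, so the result can be incomplete on some sites.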

Related

iOS 9 "Save PDF to iBooks" with HTML

iOS 9 has a new built-in UIActivity of UIActivityTypeOpenInIBooks. It's in the default list of activities in the UI and in header file but I've not been able to find any API documentation for it (yet).
It appears that UIActivityTypeOpenInIBooks will create a PDF in iBooks just fine if you include UIImages in your UIActivityViewController items or return them in your UIActivityItemProvider.
However I'd like to create a PDF with an HTML page containing text and images like Safari does. But I can't seem to find a way to pass the HTML to UIActivityTypeOpenInIBooks in an acceptable way.
I've tried passing the HTML as a String, NSData and NSURL of a file. I've also tried returning an UIMarkupTextPrintFormatter which works fine for printing but not for UIActivityTypeOpenInIBooks.
Providing an HTML string or the UIMarkupTextPrintFormatter both result in the following errors:
2015-09-08 11:35:46.392 MyApp[4599:1492484] ERROR: attempting to save to URL with no printing source (formatter/renderer) set
2015-09-08 11:35:46.393 MyApp[4599:1492484] FAILED! due to error in domain UIPrintErrorDomain with error code 4
Has anyone gotten this working?
You can convert the HTML to a PDF before saving it to iBooks.

How do I display a jpeg image stored in a core data database using html

I have several images stored in a Core Data database in this way. The entity is named note.
NSData *imageData = UIImageJPEGRepresentation(image, 0.5);
image = nil; // free memory
[self createNote];
note.photo_jpeg = imageData;
How do I reference the images in html generated for a web page to display several of these images? I think I need something like this, but I don't know what to put in the IMG SRC=...
NSString *imageHtml = [NSString stringWithFormat:@"<IMG SRC="what do i put here!!!" ALT="Photo" WIDTH=%i HEIGHT=%i>", , )];
[html appendString:imageHtml];
Update This is the solution I used:
[html appendFormat:@"<img alt=\"Embedded Image\" src=\"data:image/jpg;base64,%@\" WIDTH=400 />", [currentNote.photo_jpeg base64EncodedStringWithOptions:0]];
Where currentNote is of type note and is indexed through the notes I am displaying.
You'll need to put the image data inline with the HTML, encoded as base64.
Something like this:
NSData *imageData = // from your code
NSMutableString *html = // mutable string with whatever else you need
[html appendFormat:@"<img alt=\"Embedded Image\" src=\"data:image/jpg;base64,%@\" />", [imageData base64EncodedStringWithOptions:0]];
Keep in mind that this duplicates the image data, so if you're using a lot of images this way, make sure to watch how much memory you're using.
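The inline approach can be wrapped in a small function so it is easy to reuse for each note. This is only a sketch: `ImageTagForJPEGData` is a hypothetical name, and it uses `image/jpeg`, the registered MIME type for JPEG, in place of `image/jpg`. The alt text is inserted as-is, so it is assumed not to contain quotes or angle brackets.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: wraps JPEG bytes in an <img> tag using a data: URI.
static NSString *ImageTagForJPEGData(NSData *jpegData, NSString *altText) {
    // Base64 output is roughly 4/3 the size of the input bytes.
    NSString *base64 = [jpegData base64EncodedStringWithOptions:0];
    return [NSString stringWithFormat:
            @"<img alt=\"%@\" src=\"data:image/jpeg;base64,%@\" />", altText, base64];
}
```

Each call could then be appended to the page with `[html appendString:ImageTagForJPEGData(note.photo_jpeg, @"Photo")]`.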
NSData has the method you need to get a base64 string.
<img src="data:image/jpg;base64,HereBase64RepresentationOfYourJPG">
I do not think there is a way to have HTML dip into Core Data.
I would create a subdirectory for your Core Data store. Next to the store, in that same subdirectory, create a directory to hold your images.
You can store the images as plain JPEG files, and keep the path or bookmark URL of each file in your Core Data object.
This way, you can still access everything via Core Data, and the file is available to the HTML as well.
Just include the path to the file as part of the HTML.
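A rough sketch of the file-based variant, using only Foundation. `ImageTagBySavingJPEG` and its parameters are hypothetical names; the idea is to write the JPEG into an images directory and reference it from the HTML with a file URL. Storing only the file name in the Core Data object, and rebuilding the full URL at run time, is an assumption made here because absolute sandbox paths are not guaranteed to stay the same.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: writes the JPEG into `directory` and returns an <img> tag
// that points at the saved file. Returns nil if the directory or file cannot be written.
static NSString *ImageTagBySavingJPEG(NSData *jpegData, NSString *fileName, NSURL *directory) {
    NSFileManager *fileManager = [NSFileManager defaultManager];
    if (![fileManager createDirectoryAtURL:directory
               withIntermediateDirectories:YES
                                attributes:nil
                                     error:NULL]) {
        return nil;
    }
    NSURL *fileURL = [directory URLByAppendingPathComponent:fileName];
    if (![jpegData writeToURL:fileURL atomically:YES]) {
        return nil;
    }
    return [NSString stringWithFormat:@"<img alt=\"Photo\" src=\"%@\" />",
            fileURL.absoluteString];
}
```

Compared with the base64 approach, this keeps the HTML string small, at the cost of managing the files yourself (for example, deleting the file when the note is deleted).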

iOS Associate URL with Saved File

My app parses an xml, and builds its own custom HTML from the contents of the article chosen in the XML. When I save an article, I have a class for the action, in which I pass the article title, and the custom HTML to strings within the Save class. The class takes that and saves it to the app using:
NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectory = [paths objectAtIndex:0];
NSString *pdfPath = [documentsDirectory stringByAppendingPathComponent:[thetitle stringByAppendingString:@".html"]];
NSError *error = nil;
[thehtmlcontents writeToFile:pdfPath atomically:YES encoding:NSUTF8StringEncoding error:&error];
The issue that I have is that if I want to share a saved article via Facebook or Twitter, I can't, because the URL doesn't get saved with everything else. I can pass over the URL easy enough to the Save class, but I'm unsure of what to do with it, so that it stays associated with the article itself. Suggestions?
I'd say you broadly have three options:
1. Attach some metadata to the file noting the URL it corresponds to
2. Write out a file format that encapsulates the URL, plus the HTML
3. Include the URL in the HTML in a manner such that you can retrieve it
No. 1 would probably be best achieved by setting an extended attribute on the file. However, I'm not sure how well iOS supports this, and there may well be issues with it not being preserved in the event of something like restoring the OS.
Are you in a position to implement no. 3 reasonably cleanly? I would say a <meta> tag near the top of the document is best for doing this.
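Option 3 could be sketched roughly as follows. The meta name `source-url` and both function names are made up for illustration; nothing here is a standard. Since the app builds its own HTML, it would place the generated tag inside `<head>` when saving, and read it back when the user wants to share.

```objective-c
#import <Foundation/Foundation.h>

static NSString *const kSourceURLMetaName = @"source-url"; // hypothetical meta name

// Builds the <meta> tag to place inside <head> when the article HTML is generated.
static NSString *MetaTagForSourceURL(NSURL *sourceURL) {
    return [NSString stringWithFormat:@"<meta name=\"%@\" content=\"%@\">",
            kSourceURLMetaName, sourceURL.absoluteString];
}

// Reads the URL back from saved HTML. Only matches tags written by the function above.
static NSURL *SourceURLFromHTML(NSString *html) {
    NSString *pattern = [NSString stringWithFormat:
                         @"<meta name=\"%@\" content=\"([^\"]*)\">", kSourceURLMetaName];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:html options:0 range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil; // no tag found, e.g. an article saved before this scheme existed
    }
    return [NSURL URLWithString:[html substringWithRange:[match rangeAtIndex:1]]];
}
```

The regular expression is deliberately strict because the app controls both the writing and the reading side; it is not meant as a general HTML parser.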
All that said, how important really is it that your HTML is stored in files? To me, this sounds like it could easily be chucked into a dirt simple Core Data database.

Convert a HTML page to a iOS View

I want to take a normal HTML page (a page from the TF2 wiki) and convert it into a view.
For example, it should take a piece of text and put it into a label, or an image into a UIImageView.
Is there any framework that does something like that, or should I convert the site on a web server
to make it easier for the app to render at runtime?
I hope you understand what I want to do.
Tim
There are different ways, depending on your scenario. You could load the pages in a UIWebView on iOS. If you don't want to show the UIWebView, you can still get the HTML from it and use it however you want. Alternatively, you could transfer the HTML content to iOS as JSON data and then show that data via UIView, UILabel, or anything else. Otherwise, as suggested by Zakaria, you could use PhoneGap.
UPDATE with sample code for sending a request and getting its contents if you don't want to use UIWebView
// URL of your HTML page (the scheme is required, otherwise the request fails)
NSURL *url = [NSURL URLWithString:@"http://www.google.com"];
NSURLRequest *request = [NSURLRequest requestWithURL:url cachePolicy:NSURLRequestReloadIgnoringLocalAndRemoteCacheData timeoutInterval:30.0f];
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
[NSURLConnection sendAsynchronousRequest:request queue:queue completionHandler:^(NSURLResponse *response, NSData *data, NSError *error) {
    // Runs on `queue`, not on the main thread
    if ([data length] > 0 && error == nil) {
        NSString *html = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
        NSLog(@"this string contains all html page you have \n %@", html);
    }
}];
Not sure if I understand the question properly, but Cordova lets you develop iOS apps in HTML/CSS/JS.
Check their website: http://cordova.apache.org/
If your page is a static one, then use PhoneGap: it will embed your HTML+JS+CSS in an iOS app and load them in a web view.
But there is no framework that will allow you to convert your HTML elements into native ones.

How can I pull HTML code into an NSString from an iOS app?

I am trying to pull down the code from an HTML website that has no more than 2 lines on it. The code contains a word that I need to retrieve. Is there a simple way to pull down that code and put it in an NSString?
Further details: I am going to have an app that checks for a word on a page. If that word is what I am looking for, the app will show the text "confirmed". The purpose of the app is to check to see if the page is accessible.
If you need an HTTP library to hit the server, try ASIHTTPRequest. Apart from this, I need more info about what you are trying to do...
If you just want to check if that website is reachable, you can go with HTTP Success Status Codes.
Using ASIHTTPRequest simplifies communication over the web.
If you still want to evaluate the text on that website, you can also just retrieve it using:
[request responseString];
Depending on what you get from the website, it's up to you how to update the UI.
Just change the link between the quotes and it'll work!
- (void)viewDidLoad {
    [super viewDidLoad];
    // RSS Feed URL goes between quotes
    NSString *sFeedURL = @"http://www.google.com/ig/api?weather=,,,270000,960000";
    // Synchronous fetch: blocks the calling thread until the download finishes
    NSString *sActualFeed = [NSString stringWithContentsOfURL:[NSURL URLWithString:sFeedURL] encoding:NSASCIIStringEncoding error:nil];
    NSLog(@"%@", sActualFeed);
}
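If blocking the main thread is a concern, the same check can be sketched with NSURLSession. This is one possible shape under stated assumptions, not the only way to do it: `PageConfirms` and `CheckPage` are hypothetical names, the page is assumed to be UTF-8 text, and `word` is assumed to be non-nil. The check combines both suggestions from the answers: a 2xx status code and the expected word in the body.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when the status is 2xx and the body contains `word`.
static BOOL PageConfirms(NSInteger statusCode, NSData *body, NSString *word) {
    if (statusCode < 200 || statusCode > 299 || body == nil) {
        return NO;
    }
    NSString *text = [[NSString alloc] initWithData:body encoding:NSUTF8StringEncoding];
    if (text == nil) {
        return NO; // body was not valid UTF-8
    }
    return [text rangeOfString:word options:NSCaseInsensitiveSearch].location != NSNotFound;
}

// Fetches `url` in the background and reports the result on the main queue.
static void CheckPage(NSURL *url, NSString *word, void (^completion)(BOOL confirmed)) {
    NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithURL:url
        completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
            NSInteger status = 0;
            if ([response isKindOfClass:[NSHTTPURLResponse class]]) {
                status = ((NSHTTPURLResponse *)response).statusCode;
            }
            BOOL confirmed = (error == nil) && PageConfirms(status, data, word);
            dispatch_async(dispatch_get_main_queue(), ^{
                completion(confirmed);
            });
        }];
    [task resume];
}
```

In a view controller, the completion block would be the place to set the label text to "confirmed", since it already runs on the main queue.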