Objective C Using NSScanner to obtain <body> from html

I am trying to create an iOS app that simply extracts the <body> section of a web page.
I have the code working to connect to the URL and store the HTML in an NSString.
I have tried this, but I am just getting null strings for my result:
NSScanner* newScanner = [NSScanner scannerWithString:htmlData];
// Create a new scanner and give it the html data to parse.
while (![newScanner isAtEnd])
{
    [newScanner scanUpToString:@"<body>" intoString:NULL];
    // Scan until <body> tag is found
    [newScanner scanUpToString:@"</body>" intoString:&bodyText];
    // Everything up to the end tag will get placed into the memory address of the result string
}
I have tried an alternative way...
NSScanner* newScanner = [NSScanner scannerWithString:htmlData];
// Create a new scanner and give it the html data to parse.
while (![newScanner isAtEnd])
{
    [newScanner scanUpToString:@"<body" intoString:NULL];
    // Scan until <body> tag is found
    [newScanner scanUpToString:@">" intoString:NULL];
    // Go to end of opening <body> tag
    [newScanner scanUpToString:@"</body>" intoString:&bodyText];
    // Everything up to the end tag will get placed into the memory address of the result string
}
This second way returns a string which starts with >< script... etc.
To be honest, I don't have a good URL to test this with, and I think it may be easier with some help on removing the tags within the body too (like <p></p>).
Any help would be very much appreciated.

I don't know why your first method didn't work. I assume you defined bodyText before that snippet. This code worked fine for me,
- (void)viewDidLoad {
[super viewDidLoad];
NSString *htmlData = @"This is some stuff before <body> this is the body </body> with some more stuff";
NSScanner* newScanner = [NSScanner scannerWithString:htmlData];
NSString *bodyText;
while (![newScanner isAtEnd]) {
[newScanner scanUpToString:@"<body>" intoString:NULL];
[newScanner scanString:@"<body>" intoString:NULL];
[newScanner scanUpToString:@"</body>" intoString:&bodyText];
}
NSLog(@"%@",bodyText); // 2015-01-28 15:58:00.360 ScanningOfHTMLProblem[1373:661934] this is the body
}
Notice that I added a call to scanString:intoString: to get past the first "<body>".
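The second attempt in the question fails for the same reason: after scanUpToString: stops in front of ">", nothing consumes the ">" itself. Below is a minimal sketch of how both the attribute case and the tag stripping the question asks about could be handled with NSScanner. The helper names BodyContentFromHTML and StringByStrippingTags are hypothetical (not from the original posts), and this is a rough approach that assumes well-formed markup; it does not handle ">" inside attribute values, comments, or script contents.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the opening <body ...> tag
// and </body>, or nil if either tag is missing.
static NSString *BodyContentFromHTML(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    // By default NSScanner skips whitespace and newlines; keep them instead.
    scanner.charactersToBeSkipped = nil;

    // Move to the opening tag, which may carry attributes (e.g. <body class="x">).
    [scanner scanUpToString:@"<body" intoString:NULL];
    if (![scanner scanString:@"<body" intoString:NULL]) {
        return nil; // no opening tag found
    }
    // Skip the attributes, then consume the ">" that ends the opening tag.
    [scanner scanUpToString:@">" intoString:NULL];
    if (![scanner scanString:@">" intoString:NULL]) {
        return nil;
    }
    // Everything up to the closing tag is the body content.
    NSString *body = nil;
    [scanner scanUpToString:@"</body>" intoString:&body];
    if (![scanner scanString:@"</body>" intoString:NULL]) {
        return nil; // no closing tag found
    }
    return body ?: @"";
}

// Hypothetical helper: removes anything that looks like <...> from the input.
static NSString *StringByStrippingTags(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil;
    NSMutableString *result = [NSMutableString string];
    while (![scanner isAtEnd]) {
        NSString *text = nil;
        // Collect the text in front of the next tag.
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        // Skip over the tag itself, if there is one.
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }
    return result;
}
```

For example, BodyContentFromHTML(@"<html><body class=\"x\"><p>Hello</p> world</body></html>") yields "<p>Hello</p> world", and passing that to StringByStrippingTags yields "Hello world". For anything beyond simple pages, a real HTML parser is likely the safer choice.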

Related

Swift Retrieve HTML data from Webview

I am trying to obtain the data within the header of a web page that is being displayed in a UIWebView.
How do I get the raw (unformatted) HTML string from the UIWebView?
Also, I'm using iOS 9.
My question is similar to Reading HTML content from a UIWebView, but that post is from 6 years ago.
From the top answer on the question you linked:
NSString *html = [yourWebView stringByEvaluatingJavaScriptFromString:
@"document.body.innerHTML"];
would translate into Swift:
let html = yourWebView.stringByEvaluatingJavaScriptFromString("document.body.innerHTML")
stringByEvaluatingJavaScriptFromString returns an optional, so you'd probably want to unwrap it later with an if let statement:
if let page = html {
// Do stuff with the now unwrapped "page" string
}
Swift 4:
if let html = self.webView.stringByEvaluatingJavaScript(from: "document.body.innerHTML"){
}
Don't forget to wait until the page has finished rendering before getting the HTML; if you read it too early, the result will be empty.
func webViewDidFinishLoad(webView: UIWebView) {
print("pageDidFinished")
if let html = webView.stringByEvaluatingJavaScriptFromString("document.documentElement.outerHTML") {
print("html=[\(html)]")
}
}

UIWebView ignores inline anchor/link

I'm using a UIWebView that loads HTML from a database string using webView.loadHTMLString(self.htmlContent, baseURL: nil)
The htmlContent contains the following:
<ul class="anchorNavigation">
<li>
1. Inline Test Link
</li>
<li>
2. Inline Test Link
</li>
...
</ul>
... and later in the HTML:
...
...
However, whenever I click the inline link in the webView nothing happens.
What I've tried so far:
Changing the anchor tag to 'real' valid W3C HTML. E.g. <a id='parsys_47728'>Test</a>
Saving the HTML to a file in the temp directory and loading it using loadRequest(). E.g. let path = tempDirectory.URLByAppendingPathComponent("content.html") and webView.loadRequest(NSURLRequest(URL: path))
Intercepting the loadRequest method by implementing the func webView(webView: UIWebView, shouldStartLoadWithRequest request: NSURLRequest, navigationType: UIWebViewNavigationType) -> Bool delegate. The request.URL says something strange like: "applewebdata://1D9D74C2-BBB4-422F-97A7-554BCCD0055A#parsys_47728"
I don't have any idea anymore how to achieve this. I know from previous projects that local HTML files in the bundle work with inline links. I just cannot figure out why this doesn't work.
Help much appreciated! Thank you!
If there's a fragment (e.g., #anchorName), then use JavaScript to scroll. Otherwise, assume it's a link without a fragment and use openURL.
// UIWebViewDelegate
- (BOOL)webView:(UIWebView *)webView shouldStartLoadWithRequest:(NSURLRequest *)request navigationType:(UIWebViewNavigationType)navigationType
{
if (navigationType == UIWebViewNavigationTypeLinkClicked ) {
// NSLog(@"request.URL %@",request.URL); // e.g., file:///.../myApp.app/#anchorName
NSString *anchorName = request.URL.fragment; // e.g., "anchorName"
if ( anchorName ) {
[webView stringByEvaluatingJavaScriptFromString:[NSString stringWithFormat:@"window.location.hash='%@';",anchorName]];
return NO;
} else { // assume http://
[[UIApplication sharedApplication] openURL:[request URL]];
return NO;
}
}
return YES;
}
I'm still looking for a way to have the scroll position change smoothly (animated) rather than jumping.

Populate a hyperlink attributed string for UIActivityViewController activities

There are lots of answers on SO that show devs how to make a string from HTML content or place a URL in a string, but my question is how to make an HTML string.
I'm trying to create a string that will return in HTML format or at least not show the URL.
So for example web devs would do this to hide the URL:
Visit Us at Google.com!
I can easily translate that to a string by doing so:
NSString *urlLink = @"www.google.com";
NSString *string = [NSString stringWithFormat:@"Visit Us at %@", urlLink];
But that doesn't replace the link with a hyperlink word of my choosing.
I'm aware that the device dictates whether it's a hyperlink depending on how you display it, i.e., text fields, text views, or you can force open it, etc.
What I'm trying to do is:
#define APPSTORELINK @"www.appstorelink.com"
@implementation Config
+(NSString *)appstorelink {
return APPSTORELINK;
}
+(NSString *)mmsmetadata {
NSString *string = [NSString stringWithFormat:@"I shared this publication with the [Name of my iPhone App] iPhone App", APPSTORELINK];
return string;
}
So I can easily call it here or app wide:
NSArray *shareItems;
UIImage *snapshot = [self imageFromView:self.view];
shareItems = @[[Config mmsmetadata], snapshot];
UIActivityViewController *activityController = [[UIActivityViewController alloc] initWithActivityItems:shareItems applicationActivities:nil];
activityController.excludedActivityTypes = @[UIActivityTypePostToFlickr, UIActivityTypeAssignToContact, UIActivityTypeMail, UIActivityTypePostToVimeo];
[activityController setCompletionWithItemsHandler:(UIActivityViewControllerCompletionWithItemsHandler)^(NSString *string, BOOL completed) {
So in short, how can I produce the string in HTML format out of the box? My main concern is that I want to hide the URL and replace it with an HTML tag, unless you have a better solution. I can't find anything on SO.
Any thoughts? I'm probably overthinking this. I'm sure there's an easier way.
EDIT
Before even posting I was aware of NSAttributedString, and that was the first thing I attempted. However, the issue isn't setting an attribute; that's the easy part. The core of my question is how to set it so it will DISPLAY as attributed when using activities in the UIActivityViewController.
Here is how I set it, but the outcome was the same as above so I figured it would be easier to use an HTML tag:
NSMutableAttributedString *string = [[NSMutableAttributedString alloc] initWithString:@"I shared this publication with the Army Pubs iPhone App!"];
NSRange selectedRange = NSMakeRange(0, [string length]);
NSURL *linkURL = [NSURL URLWithString:APPSTORELINK];
[string beginEditing];
[string addAttribute:NSLinkAttributeName
value:linkURL
range:selectedRange];
[string addAttribute:NSForegroundColorAttributeName
value:[UIColor blueColor]
range:selectedRange];
[string addAttribute:NSUnderlineStyleAttributeName
value:[NSNumber numberWithInt:NSUnderlineStyleSingle]
range:selectedRange];
[string endEditing];
return string;
However, it still displays as plain text in the Message or Mail composers. Think of how MFMailComposeViewController has a setting for isHTML: if it's set to YES, it strips all the HTML tags and displays the text as a hyperlink. For example:
MFMailComposeViewController *mc = [[MFMailComposeViewController alloc] init];
mc.mailComposeDelegate = self;
[mc setSubject:emailTitle];
[mc setMessageBody:messageBody isHTML:YES];
I want to emulate that when the activities are called from within a UIActivityViewController.
This is the current output. Even with the attributed string I tried first, it just displays as plain text: the HTML tag is stripped, but the text doesn't become a link.
See link option in attributed strings.
The link attribute specifies an arbitrary object that is passed to the NSTextView method
clickedOnLink:atIndex: when the user clicks in the text range
associated with the NSLinkAttributeName attribute. The text view’s
delegate object can implement textView:clickedOnLink:atIndex: or
textView:clickedOnLink: to process the link object. Otherwise, the
default implementation checks whether the link object is an NSURL
object and, if so, opens it in the URL’s default application.

MFMailComposeViewController crashes on iOS5 when setting HTML body

I'm totally baffled. I am trying to let the user send an email via MFMailComposeViewController. However, when I set up my MFMailComposeViewController with a HTML body, the MFMailComposeViewController displays the email correctly, but on sending the mail my application crashes with a log like this:
2011-12-03 15:04:08.708 Kinopilot[58387:17503] *** Terminating app due to uncaught
exception 'DOMException', reason: '*** INVALID_ACCESS_ERR: DOMException 15'
*** First throw call stack:
(0x2994052 0x210fd0a 0x2993f11 0x391affb 0x3886795 0x64a9b4 0x6487b6
0x648b84 0x648bb4 0x648bb4 0x6424b9 0x69078b 0x658fad 0x658eeb 0x658e6f
0x656d37 0x656bb2 0x65737c 0x65759e 0x658273 0x658758 0x6588e6 0x2995ec9
0x10965c2 0x12d1d54 0x2995ec9 0x10965c2 0x109655a 0x113bb76 0x113c03f
0x113b2fe 0x10bba30 0x10bbc56 0x10a2384 0x1095aa9 0x2270fa9 0x29681c5
0x28cd022 0x28cb90a 0x28cadb4 0x28caccb 0x226f879 0x226f93e 0x1093a9b 0x6680d 0x663d5)
I am pretty confident that the MFMailComposeViewController is wired correctly, because all is fine when I set the body as plain text. Besides, my code works fine w/HTML and plain text on iOS 4.3.
This is the relevant code:
@interface MailComposeDelegate: NSObject<MFMailComposeViewControllerDelegate>
@end
@implementation MailComposeDelegate
+(MailComposeDelegate*) sharedMailComposeDelegate
{
static MailComposeDelegate* mcd = nil;
if(!mcd) {
mcd = [[MailComposeDelegate alloc]init];
}
return mcd;
}
-(void)mailComposeController:(MFMailComposeViewController*)controller
didFinishWithResult:(MFMailComposeResult)result
error:(NSError*)error
{
[app.window.rootViewController dismissModalViewControllerAnimated:NO];
}
...
@implementation M3AppDelegate(Email)
-(void)composeEmailWithSubject: (NSString*)subject
andBody: (NSString*)body
{
// -- compose HTML message.
MFMailComposeViewController* mc = nil; // explicit nil so the check below is safe without ARC
if([MFMailComposeViewController canSendMail])
mc = [[MFMailComposeViewController alloc] init];
if(!mc) return;
mc.mailComposeDelegate = [MailComposeDelegate sharedMailComposeDelegate];
[mc setSubject: subject];
[mc setMessageBody: body isHTML: YES];
// -- show message composer.
[app.window.rootViewController presentModalViewController:mc animated:YES];
}
@end
Ha, answering my own question here: some iOS 5 component wouldn't process some of my markup, namely this div:
<div style='font-size: 85%'>
Changing the relative size into an absolute pixel height worked fine, though.

How can I get name from link?

I'm writing in Objective-C.
I have a web view, and the local file index.html contains:
<a href='http://www.google.com' name="666">
How can I get the name attribute?
Thanks!
It depends on when/by what you need to get the name. If you need the name when someone clicks on the link, you can set up some JavaScript that runs when the link is clicked (onclick handler). If you just have the html string, you can use regular expressions to parse the document and pull out all of the name attributes. A good regular expression library for Objective-C is RegexKit (or RegexKitLite on the same page).
The regular expression for parsing the name attribute out of a link would look something like this:
/<a[^>]+?name="?([^" >]*)"?>/i
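As a rough sketch, the same pattern can also be applied with Foundation's built-in NSRegularExpression instead of RegexKit (this swap is an assumption on my part, not what the answer used). The function name LinkNamesInHTML is hypothetical. Note that the pattern, as written above, only matches when name is the last attribute before the closing ">".

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: collects the name attribute of every <a ...> tag that
// the pattern from the answer matches. Returns an empty array on failure.
static NSArray<NSString *> *LinkNamesInHTML(NSString *html) {
    NSError *error = nil;
    // Same pattern as above, escaped for an Objective-C string literal;
    // the trailing /i flag becomes the case-insensitive option.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<a[^>]+?name=\"?([^\" >]*)\"?>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (!regex) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *names = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, html.length);
    for (NSTextCheckingResult *match in [regex matchesInString:html options:0 range:whole]) {
        // Capture group 1 holds the attribute value.
        [names addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return names;
}
```

Called with the link from the question, LinkNamesInHTML(@"<a href='http://www.google.com' name=\"666\">") returns an array containing "666".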
EDIT: The javascript for getting a name out of a link when someone clicked it would look something like this:
function getNameAttribute(element) {
alert(element.name); //Or do something else with the name, `element.name` contains the value of the name attribute.
}
This would be called from the onclick handler, something like:
<a href="http://www.google.com" name="666" onclick="getNameAttribute(this)">My Link</a>
If you need to get the name back to your Objective-C code, you could write your onclick function to append the name attribute to the url in the form of a hashtag, and then trap the request and parse it in your UIWebView delegate's -webView:shouldStartLoadWithRequest:navigationType: method. That would go something like this:
function getNameAttribute(element) {
element.href += '#'+element.name;
}
//Then in your delegate's .m file
- (BOOL)webView:(UIWebView *)webView
shouldStartLoadWithRequest:(NSURLRequest *)request
navigationType:(UIWebViewNavigationType)navigationType {
// NSURL has no componentsSeparatedByString:, so split its string form instead.
NSArray *urlParts = [[[request URL] absoluteString] componentsSeparatedByString:@"#"];
NSString *url = [urlParts objectAtIndex:0];
NSString *name = [urlParts lastObject];
if([url isEqualToString:@"http://www.google.com/"]){
//Do something with `name`
}
return NO; //Or YES if you want to follow the link
}
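As an alternative to splitting the string on "#", here is a minimal sketch that lets Foundation parse the URL. It relies on NSURL's fragment property and NSURLComponents; the helper names NameFromLinkURL and LinkURLWithoutName are hypothetical, and the sketch assumes the name was appended as the URL fragment by the JavaScript above.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: the part after "#", or nil if the URL has no fragment.
static NSString *NameFromLinkURL(NSURL *url) {
    return url.fragment;
}

// Hypothetical helper: the URL string with the fragment removed, suitable for
// comparing against the link's original address.
static NSString *LinkURLWithoutName(NSURL *url) {
    NSURLComponents *components = [NSURLComponents componentsWithURL:url
                                             resolvingAgainstBaseURL:NO];
    components.fragment = nil;
    return components.string;
}
```

Inside the delegate method, these could be called as NameFromLinkURL(request.URL) and LinkURLWithoutName(request.URL). Unlike the split-based version, a URL without any "#" gives a nil name rather than the whole URL.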