Converting HTML text into plain text using Objective-C - html

I have a huge NSString with HTML text inside. The string is more than 3,500,000 characters long. How can I convert this HTML text into an NSString containing plain text? I was using NSScanner, but it is too slow. Any ideas?

It depends on which iOS version you are targeting. Since iOS 7 there has been a built-in method that not only strips the HTML tags but also applies the formatting to the string:
Xcode 9/Swift 4
if let htmlStringData = htmlString.data(using: .utf8), let attributedString = try? NSAttributedString(data: htmlStringData, options: [.documentType : NSAttributedString.DocumentType.html], documentAttributes: nil) {
print(attributedString)
}
You can even create an extension like this:
extension String {
var htmlToAttributedString: NSAttributedString? {
guard let data = self.data(using: .utf8) else {
return nil
}
do {
return try NSAttributedString(data: data, options: [.documentType : NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
} catch {
print("Cannot convert html string to attributed string: \(error)")
return nil
}
}
}
Note that this sample code uses UTF-8 encoding. You could also write a function instead of a computed property and take the encoding as a parameter.
Swift 3
let attributedString = try NSAttributedString(data: htmlString.data(using: .utf8)!,
options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
documentAttributes: nil)
Objective-C
[[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding] options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: [NSNumber numberWithInt:NSUTF8StringEncoding]} documentAttributes:nil error:nil];
If you just need to remove everything between < and > (dirty way!!!), which might be problematic if you have these characters in the string, use this:
- (NSString *)stringByStrippingHTML {
NSRange r;
NSString *s = [[self copy] autorelease];
while ((r = [s rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
s = [s stringByReplacingCharactersInRange:r withString:@""];
return s;
}
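For a string of several million characters, the loop above searches again from the start and copies the string after every match, so its cost grows roughly quadratically with the number of tags. As a rough sketch of the same regex idea done in one pass, here in Swift using only Foundation (the helper name strippingTags(from:) is made up for illustration, and the same caveat about stray < and > characters applies):

```swift
import Foundation

/// Hypothetical helper: removes everything that looks like a tag in a single
/// regex pass instead of looping over one match at a time.
func strippingTags(from html: String) -> String {
    // Same "dirty" pattern as above: a '<', one or more non-'>' characters, a '>'.
    return html.replacingOccurrences(of: "<[^>]+>",
                                     with: "",
                                     options: .regularExpression)
}
```

Like the Objective-C version, this does not decode entities such as &amp; and will misbehave on text that contains a literal < followed later by a >.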

I solved my problem with NSScanner, but I don't run it over the whole text at once. I run it on chunks of 10,000 characters and then concatenate all the parts together. My code is below:
-(NSString *)convertHTML:(NSString *)html {
NSScanner *myScanner;
NSString *text = nil;
myScanner = [NSScanner scannerWithString:html];
while ([myScanner isAtEnd] == NO) {
[myScanner scanUpToString:@"<" intoString:NULL] ;
[myScanner scanUpToString:@">" intoString:&text] ;
html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
}
//
html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
return html;
}
Swift 4:
func htmlToString(html:String) -> String {
var htmlStr =html;
let scanner:Scanner = Scanner(string: htmlStr);
var text:NSString? = nil;
while scanner.isAtEnd == false {
scanner.scanUpTo("<", into: nil);
scanner.scanUpTo(">", into: &text);
htmlStr = htmlStr.replacingOccurrences(of: "\(text ?? "")>", with: "");
}
htmlStr = htmlStr.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines);
return htmlStr;
}
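The scanner versions above call a whole-string replace for every tag they find, which is likely why they slow down on large inputs. A possible alternative is a small state machine that walks the text once and copies only the characters outside tags. This is a minimal sketch, not a real HTML parser; the function name plainText(fromHTML:) is an assumption for illustration, and it ignores comments, scripts and entities:

```swift
import Foundation

/// Hypothetical helper: copies every scalar that is not between '<' and '>'.
func plainText(fromHTML html: String) -> String {
    var output = String.UnicodeScalarView()
    var insideTag = false
    for scalar in html.unicodeScalars {
        switch scalar {
        case "<":
            insideTag = true            // a tag starts; stop copying
        case ">" where insideTag:
            insideTag = false           // the tag ends; resume copying
        default:
            if !insideTag { output.append(scalar) }
        }
    }
    // Trim surrounding whitespace, as the scanner versions above do.
    return String(output).trimmingCharacters(in: .whitespacesAndNewlines)
}
```

Because each character is visited once, the running time should grow linearly with the input, so splitting the text into chunks should not be necessary.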

Objective C
+ (NSString*)textToHtml:(NSString*)htmlString
{
htmlString = [htmlString stringByReplacingOccurrencesOfString:@"&quot;" withString:@"\""];
htmlString = [htmlString stringByReplacingOccurrencesOfString:@"&apos;" withString:@"'"];
htmlString = [htmlString stringByReplacingOccurrencesOfString:@"&lt;" withString:@"<"];
htmlString = [htmlString stringByReplacingOccurrencesOfString:@"&gt;" withString:@">"];
// Decode &amp; last so that text such as "&amp;lt;" is not decoded twice.
htmlString = [htmlString stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
return htmlString;
}
Hope this helps!
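For readers working in Swift, the same entity replacement might look like the sketch below. The function name decodingBasicEntities is made up for illustration; it covers only the five entities listed above, and it decodes &amp; last so that already-escaped text is not decoded twice:

```swift
import Foundation

/// Hypothetical helper: decodes the five basic entities and nothing else.
func decodingBasicEntities(_ text: String) -> String {
    // Order matters: "&amp;" must come last, otherwise "&amp;lt;" becomes "<".
    let replacements: [(entity: String, character: String)] = [
        ("&quot;", "\""),
        ("&apos;", "'"),
        ("&lt;", "<"),
        ("&gt;", ">"),
        ("&amp;", "&"),
    ]
    return replacements.reduce(text) { partial, pair in
        partial.replacingOccurrences(of: pair.entity, with: pair.character)
    }
}
```

Numeric entities such as &#39; and named entities such as &nbsp; are left untouched by this sketch.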

For Swift Language ,
NSAttributedString(data:(htmlString as! String).dataUsingEncoding(NSUTF8StringEncoding, allowLossyConversion: true
)!, options:[NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSNumber(unsignedLong: NSUTF8StringEncoding)], documentAttributes: nil, error: nil)!

- (NSString *)stringByStrippingHTML:(NSString *)inputString
{
NSMutableString *outString;
if (inputString)
{
outString = [[NSMutableString alloc] initWithString:inputString];
if ([inputString length] > 0)
{
NSRange r;
while ((r = [outString rangeOfString:@"<[^>]+>|&nbsp;" options:NSRegularExpressionSearch]).location != NSNotFound)
{
[outString deleteCharactersInRange:r];
}
}
}
return outString;
}

Swift 4:
do {
let cleanString = try NSAttributedString(data: htmlContent.data(using: String.Encoding.utf8)!,
options: [.documentType: NSAttributedString.DocumentType.html],
documentAttributes: nil)
} catch {
print("Something went wrong")
}

This could be made more generic by passing the encoding as a parameter, but as an example, here is a category:
@implementation NSString (CSExtension)
- (NSString *)htmlToText {
return [NSAttributedString.alloc
initWithData:[self dataUsingEncoding:NSUnicodeStringEncoding]
options:@{NSDocumentTypeDocumentOption: NSHTMLTextDocumentType}
documentAttributes:nil error:nil].string;
}
@end

Related

How apply CSS to a iOS NSAttributedString

I'm working in a iOS project and I've got this HTML text that i need to display in a UITextView:
http://google.com/<br><span class="wysiwyg-color-d24b57 ">TEST<br><i><span class="wysiwyg-color-704938 ">TEST2<br></span></i></span><div class="wysiwyg-text-align-left "><span class="wysiwyg-color-d24b57 "><span class="wysiwyg-color-704938 "><u><span class="wysiwyg-color-3a529c ">Test3<br>TEST4<br><br></span></u><span class="wysiwyg-color-3a529c wysiwyg-font-size-x-large ">Test5</span><u><span class="wysiwyg-color-3a529c "><br><br><br></span></u></span></span></div>
So far, I've created the NSAttributedString and the UITextView:
NSAttributedString *attributedString = [[NSAttributedString alloc]
initWithData: [example.text dataUsingEncoding:NSUnicodeStringEncoding]
options: @{ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType }
documentAttributes: nil
error: nil
];
UITextView *textView = [[UITextView alloc] init];
textView.frame = [node labelFrame];
textView.attributedText = attributedString;
textView.editable = NO;
textView.selectable = YES;
textView.backgroundColor = [UIColor clearColor];
return textView;
Now I have to apply the wysiwyg CSS on it, as at the moment, color, font-size and alignment are not working.
Does anybody have any idea how to do this?
Thanks in advance!
You can do it this way:
let divHtml = "<div style=\"font-family:Helvetica;font-size:14px;line-height:22px\" >" + YOUR TEXT + "</div>"
let attributedString = try! NSAttributedString(data: divHtml.dataUsingEncoding(NSUTF8StringEncoding)!, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType], documentAttributes: nil)
textView.attributedText = attributedString

Check if an NSString is an html string?

I am creating an application that should display a list of strings. Each string is returned from the server and may or may not be HTML.
I am currently setting the text in a UILabel using the code below:
NSMutableAttributedString *attributedTitleString = [[NSMutableAttributedString alloc] initWithData:[title dataUsingEncoding:NSUnicodeStringEncoding] options:@{ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType } documentAttributes:nil error:nil];
cell.label.attributedText =attributedTitleString;
When the text is HTML, everything works perfectly, since the font and alignment are specified inside the HTML. The issue occurs when the text is plain text: the font, text alignment, text size and other settings are no longer respected.
So how can I check if the text is an html string or not?
I will be using the following in case of normal text:
cell.label.text =title;
I have searched the forums but still haven't found an answer to my issue.
This works fine; you need to set cell.label.attributedText in the case of normal text too, building the attributed string the same way.
Run the code below to see it working.
//If HTML Text
NSString *htmlstr = @"This is <font color='red'>simple</font>";
NSMutableAttributedString *attributedTitleString = [[NSMutableAttributedString alloc] initWithData:[htmlstr dataUsingEncoding:NSUnicodeStringEncoding] options:@{ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType } documentAttributes:nil error:nil];
textField.attributedText =attributedTitleString;
textField.font = [UIFont fontWithName:@"Verdana" size:20.0];
//If Normal text.
NSString *normalStr = @"This is Renuka";
NSMutableAttributedString *NorAttributedTitleString = [[NSMutableAttributedString alloc] initWithData:[normalStr dataUsingEncoding:NSUnicodeStringEncoding] options:@{ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType } documentAttributes:nil error:nil];
textField.attributedText = NorAttributedTitleString;
textField.font = [UIFont fontWithName:@"Verdana" size:20.0];
You can check whether your string contains an HTML tag:
// iOS8 +
NSString *string = @"<TAG>bla bla bla html</TAG>";
if ([string containsString:@"<TAG"]) {
NSLog(@"html string");
} else {
NSLog(@"no html string");
}
// iOS7 +
NSString *string = @"<TAG>bla bla bla html</TAG>";
if ([string rangeOfString:@"<TAG"].location != NSNotFound) {
NSLog(@"html string");
} else {
NSLog(@"no html string");
}
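The check above looks for one specific tag name. If the server can return arbitrary markup, a slightly more general heuristic is to look for anything shaped like an opening or closing tag. The Swift sketch below is only a heuristic under that assumption (the name looksLikeHTML and the pattern are illustrative, not an established API), and it can still be fooled by plain text that happens to contain something like <b>:

```swift
import Foundation

/// Hypothetical heuristic: true if the text contains something shaped like
/// <name>, </name>, <name attr="..."> or <name/>.
func looksLikeHTML(_ text: String) -> Bool {
    // '<', optional '/', a tag name starting with a letter,
    // optional attributes after whitespace, optional '/', then '>'.
    let pattern = "</?[A-Za-z][A-Za-z0-9]*(\\s[^<>]*)?/?>"
    return text.range(of: pattern, options: .regularExpression) != nil
}
```

With a check like this, the label could use attributedText for strings that look like HTML and plain text for everything else.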
Since the syntax and structure of HTML and XML are quite similar, you can use XMLDocument to check whether a given data object is HTML or plain text.
Here, data is of type Data and is returned by a URLSession dataTask.
// Default to plain text
var options : [NSAttributedString.DocumentReadingOptionKey: Any] = [.documentType: NSAttributedString.DocumentType.plain]
// This will succeed for HTML, but not for plain text
if let _ = try? XMLDocument(data: data, options: []) {
options[.documentType] = NSAttributedString.DocumentType.html
}
guard let string = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else { return }
Update
The above method only sort of works: some HTML failed to parse as an XML document. For the moment I have a different solution, which I don't really like.
The symptom is that we get only one line back in the attributed string, so simply check for that and create a new attributed string if there is only one line.
var options : [NSAttributedString.DocumentReadingOptionKey: Any] = [.documentType: NSAttributedString.DocumentType.html]
guard var string = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else { return }
if string.string.split(separator: "\n").count == 1 {
options[.documentType] = NSAttributedString.DocumentType.plain
guard let tempString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else { return }
string = tempString
}

How can I remove html tags from a string?

I have a string that contains HTML code:
let htmlString = "<p style=\"text-align: right;\"> text and text"
and I want to ignore the HTML markup and get a string with only the text.
Thank you.
You can remove html tag from string by using NSAttributedString.
Please find the below code :
let htmlString = "<p style=\"text-align: right;\"> text and text"
do {
let encodedData = htmlString.dataUsingEncoding(NSUTF8StringEncoding)!
let attributedOptions : [String: AnyObject] = [
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
]
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
print("final strings :",attributedString.string)
} catch {
fatalError("Unhandled error: \(error)")
}
Hope it works for you!!!
You can also create String extension for reusability:
extension String {
init(htmlString: String) {
do {
let encodedData = htmlString.dataUsingEncoding(NSUTF8StringEncoding)!
let attributedOptions : [String: AnyObject] = [
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
]
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
self.init(attributedString.string)
} catch {
fatalError("Unhandled error: \(error)")
}
}
}
Swift 3.0 - (Xcode 8.2) Update
extension String {
var normalizedHtmlString : String {
do {
if let encodedData = self.data(using: .utf8){
let attributedOptions : [String: AnyObject] = [
NSDocumentTypeDocumentAttribute : NSHTMLTextDocumentType as AnyObject,
NSCharacterEncodingDocumentAttribute: NSNumber(value: String.Encoding.utf8.rawValue)
]
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
return attributedString.string
}
}
catch {
assert(false, "Please check string")
//fatalError("Unhandled error: \(error)")
}
return self
}
}
And call the htmlString method :
let yourHtmlString = "<p style=\"text-align: right;\"> text and text"
let decodedString = String(htmlString:yourHtmlString)
You may try using -[NSAttributedString initWithData:options:documentAttributes:error:];
+ (NSAttributedString*)attributedStringForHTMLStrippingWithHTMLString:(NSString*)htmlString error:(NSError**)error
{
NSAttributedString *result = nil;
NSMutableAttributedString *attributedString = nil;
NSData *htmlStringData = [htmlString dataUsingEncoding:NSUTF8StringEncoding];
NSDictionary *options = @{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)};
attributedString = [[NSMutableAttributedString alloc] initWithData:htmlStringData
options:options
documentAttributes:nil
error:error];
result = [attributedString copy];
return result;
}
+ (NSString*)stripStringOfHTMLTags:(NSString*)htmlString
{
NSString *result = nil;
NSError *error = nil;
NSAttributedString *attributedString = [self attributedStringForHTMLStrippingWithHTMLString:htmlString error:&error];
result = [attributedString string];
return result;
}
Try using a css file like so:
html-file:
<p id="myText" />
css-file:
#myText
{
text-align: right;
}

Parsing HTML into NSAttributedText - how to set font?

I am trying to get a snippet of text that is formatted in html to display nicely on an iPhone in a UITableViewCell.
So far I have this:
NSError* error;
NSString* source = @"<strong>Nice</strong> try, Phil";
NSMutableAttributedString* str = [[NSMutableAttributedString alloc] initWithData:[source dataUsingEncoding:NSUTF8StringEncoding]
options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: [NSNumber numberWithInt:NSUTF8StringEncoding]}
documentAttributes:nil error:&error];
This kind of works. I get some text that has 'Nice' in bold! But... it also sets the font to be Times Roman! This is not the font face I want.
I am thinking I need to set something in the documentAttributes, but, I can't find any examples anywhere.
Swift 2 version, based on the answer given by Javier Querol
extension UILabel {
func setHTMLFromString(text: String) {
let modifiedFont = NSString(format:"<span style=\"font-family: \(self.font!.fontName); font-size: \(self.font!.pointSize)\">%@</span>", text) as String
let attrStr = try! NSAttributedString(
data: modifiedFont.dataUsingEncoding(NSUnicodeStringEncoding, allowLossyConversion: true)!,
options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding],
documentAttributes: nil)
self.attributedText = attrStr
}
}
Swift 3.0 and iOS 9+
extension UILabel {
func setHTMLFromString(htmlText: String) {
let modifiedFont = String(format:"<span style=\"font-family: '-apple-system', 'HelveticaNeue'; font-size: \(self.font!.pointSize)\">%@</span>", htmlText)
let attrStr = try! NSAttributedString(
data: modifiedFont.data(using: .unicode, allowLossyConversion: true)!,
options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue],
documentAttributes: nil)
self.attributedText = attrStr
}
}
Swift 5 and iOS 11+
extension UILabel {
func setHTMLFromString(htmlText: String) {
let modifiedFont = String(format:"<span style=\"font-family: '-apple-system', 'HelveticaNeue'; font-size: \(self.font!.pointSize)\">%@</span>", htmlText)
let attrStr = try! NSAttributedString(
data: modifiedFont.data(using: .unicode, allowLossyConversion: true)!,
options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding:String.Encoding.utf8.rawValue],
documentAttributes: nil)
self.attributedText = attrStr
}
}
#import "UILabel+HTML.h"
@implementation UILabel (HTML)
- (void)jaq_setHTMLFromString:(NSString *)string {
string = [string stringByAppendingString:[NSString stringWithFormat:@"<style>body{font-family: '%@'; font-size:%fpx;}</style>",
self.font.fontName,
self.font.pointSize]];
self.attributedText = [[NSAttributedString alloc] initWithData:[string dataUsingEncoding:NSUnicodeStringEncoding]
options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
documentAttributes:nil
error:nil];
}
@end
This way you don't need to specify which font you want, it will take the label font and size.
I actually found a working solution to this problem:
Changing the font in your HTML response string before it gets parsed.
NSString *aux = [NSString stringWithFormat:@"<span style=\"font-family: YOUR_FONT_NAME; font-size: SIZE\">%@</span>", htmlResponse];
Example:
NSString *aux = [NSString stringWithFormat:@"<span style=\"font-family: HelveticaNeue-Thin; font-size: 17\">%@</span>", [response objectForKey:@"content"]];
Swift version:
let aux = "<span style=\"font-family: YOUR_FONT_NAME; font-size: SIZE\">\(htmlResponse)</span>"
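If the wrapping is needed in more than one place, it could be pulled into a small helper that builds the span from a font name and size. This is only a sketch: the function name and parameters are invented for illustration, the px unit is an assumption (the examples above pass a bare number), and the result still has to be handed to NSAttributedString with the HTML document type as shown earlier:

```swift
import Foundation

/// Hypothetical helper: wraps an HTML fragment in a span carrying the font.
func wrappingInFontSpan(_ html: String, fontFamily: String, pointSize: Double) -> String {
    // The family is quoted so that names containing spaces stay intact.
    return "<span style=\"font-family: '\(fontFamily)'; font-size: \(pointSize)px\">\(html)</span>"
}

// Example call with made-up values.
let wrapped = wrappingInFontSpan("<strong>Nice</strong> try, Phil",
                                 fontFamily: "HelveticaNeue-Thin",
                                 pointSize: 17)
```

Keeping the string building separate from the parsing makes it easy to check the generated markup before it is converted.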
A more generic approach is to look at the font traits while enumerating, and create a font with the same traits (bold, italic, etc.):
extension NSMutableAttributedString {
/// Replaces the base font (typically Times) with the given font, while preserving traits like bold and italic
func setBaseFont(baseFont: UIFont, preserveFontSizes: Bool = false) {
let baseDescriptor = baseFont.fontDescriptor
let wholeRange = NSRange(location: 0, length: length)
beginEditing()
enumerateAttribute(.font, in: wholeRange, options: []) { object, range, _ in
guard let font = object as? UIFont else { return }
// Instantiate a font with our base font's family, but with the current range's traits
let traits = font.fontDescriptor.symbolicTraits
guard let descriptor = baseDescriptor.withSymbolicTraits(traits) else { return }
let newSize = preserveFontSizes ? font.pointSize : baseDescriptor.pointSize
let newFont = UIFont(descriptor: descriptor, size: newSize)
self.removeAttribute(.font, range: range)
self.addAttribute(.font, value: newFont, range: range)
}
endEditing()
}
}
Figured it out. Bit of a bear, and maybe not the best answer.
This code will go through all the font changes. I know that it is using "Times New Roman" and "Times New Roman BoldMT" for the fonts.
But regardless, this will find the bold fonts and let me reset them. I can also reset the size while I'm at it.
I honestly hope/think there is a way to set this up at parse time, but I can't find it if there is.
- (void)changeFont:(NSMutableAttributedString*)string
{
NSRange range = (NSRange){0,[string length]};
[string enumerateAttribute:NSFontAttributeName inRange:range options:NSAttributedStringEnumerationLongestEffectiveRangeNotRequired usingBlock:^(id value, NSRange range, BOOL *stop) {
UIFont* currentFont = value;
UIFont *replacementFont = nil;
if ([currentFont.fontName rangeOfString:@"bold" options:NSCaseInsensitiveSearch].location != NSNotFound) {
replacementFont = [UIFont fontWithName:@"HelveticaNeue-CondensedBold" size:25.0f];
} else {
replacementFont = [UIFont fontWithName:@"HelveticaNeue-Thin" size:25.0f];
}
[string addAttribute:NSFontAttributeName value:replacementFont range:range];
}];
}
Swift 4+ update of UILabel extension
extension UILabel {
func setHTMLFromString(text: String) {
let modifiedFont = NSString(format:"<span style=\"font-family: \(self.font!.fontName); font-size: \(self.font!.pointSize)\">%@</span>" as NSString, text)
let attrStr = try! NSAttributedString(
data: modifiedFont.data(using: String.Encoding.unicode.rawValue, allowLossyConversion: true)!,
options: [NSAttributedString.DocumentReadingOptionKey.documentType:NSAttributedString.DocumentType.html, NSAttributedString.DocumentReadingOptionKey.characterEncoding: String.Encoding.utf8.rawValue],
documentAttributes: nil)
self.attributedText = attrStr
}
}
iOS 9+
extension UILabel {
func setHTMLFromString(htmlText: String) {
let modifiedFont = NSString(format:"<span style=\"font-family: '-apple-system', 'HelveticaNeue'; font-size: \(self.font!.pointSize)\">%@</span>" as NSString, htmlText) as String
//process collection values
let attrStr = try! NSAttributedString(
data: modifiedFont.data(using: .unicode, allowLossyConversion: true)!,
options: [NSAttributedString.DocumentReadingOptionKey.documentType:NSAttributedString.DocumentType.html, NSAttributedString.DocumentReadingOptionKey.characterEncoding: String.Encoding.utf8.rawValue],
documentAttributes: nil)
self.attributedText = attrStr
}
}
Yes, there is an easier solution. Set the font in the html source!
NSError* error;
NSString* source = @"<strong>Nice</strong> try, Phil";
source = [source stringByAppendingString:@"<style>strong{font-family: 'Avenir-Roman';font-size: 14px;}</style>"];
NSMutableAttributedString* str = [[NSMutableAttributedString alloc] initWithData:[source dataUsingEncoding:NSUTF8StringEncoding]
options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: [NSNumber numberWithInt:NSUTF8StringEncoding]}
documentAttributes:nil error:&error];
Hope this helps.
The answers above all work OK if you're doing the conversion at the same time as creating the NSAttributedString. But I think a better solution, which works on the string itself and therefore doesn't need access to the input, is the following category:
extension NSMutableAttributedString
{
func convertFontTo(font: UIFont)
{
var range = NSMakeRange(0, 0)
while (NSMaxRange(range) < length)
{
let attributes = attributesAtIndex(NSMaxRange(range), effectiveRange: &range)
if let oldFont = attributes[NSFontAttributeName]
{
let newFont = UIFont(descriptor: font.fontDescriptor().fontDescriptorWithSymbolicTraits(oldFont.fontDescriptor().symbolicTraits), size: font.pointSize)
addAttribute(NSFontAttributeName, value: newFont, range: range)
}
}
}
}
Use as:
let desc = NSMutableAttributedString(attributedString: *someNSAttributedString*)
desc.convertFontTo(UIFont.systemFontOfSize(16))
Works on iOS 7+
Improving on Victor's solution, including color:
extension UILabel {
func setHTMLFromString(text: String) {
let modifiedFont = NSString(format:"<span style=\"color:\(self.textColor.toHexString());font-family: \(self.font!.fontName); font-size: \(self.font!.pointSize)\">%@</span>", text) as String
let attrStr = try! NSAttributedString(
data: modifiedFont.dataUsingEncoding(NSUnicodeStringEncoding, allowLossyConversion: true)!,
options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding],
documentAttributes: nil)
self.attributedText = attrStr
}
}
For this to work you will also need YLColor.swift, which provides the UIColor-to-hex conversion: https://gist.github.com/yannickl/16f0ed38f0698d9a8ae7
Using NSHTMLTextDocumentType is slow, and it makes styles hard to control. I suggest you try my library, Atributika. It has its own very fast parser, and you can use any tag names and define any style for them.
Example:
let str = "<strong>Nice</strong> try, Phil".style(tags:
Style("strong").font(.boldSystemFont(ofSize: 15))).attributedString
label.attributedText = str
You can find it here https://github.com/psharanda/Atributika
Joining together everyone's answers, I made two extensions that allow setting a label with html text. Some answers above did not correctly interpret the font family in the attributed strings. Others were incomplete for my needs or failed in other ways. Let me know if there's anything you'd like me to improve on.
I hope this helps someone.
extension UILabel {
/// Sets the label using the supplied html, using the label's font and font size as a basis.
/// For predictable results, use only simple html without style sheets.
/// See https://stackoverflow.com/questions/19921972/parsing-html-into-nsattributedtext-how-to-set-font
///
/// - Returns: Whether the text could be converted.
@discardableResult func setAttributedText(fromHtml html: String) -> Bool {
guard let data = html.data(using: .utf8, allowLossyConversion: true) else {
print(">>> Could not create UTF8 formatted data from \(html)")
return false
}
do {
let mutableText = try NSMutableAttributedString(
data: data,
options: [NSAttributedString.DocumentReadingOptionKey.documentType: NSAttributedString.DocumentType.html, NSAttributedString.DocumentReadingOptionKey.characterEncoding: String.Encoding.utf8.rawValue],
documentAttributes: nil)
mutableText.replaceFonts(with: font)
self.attributedText = mutableText
return true
} catch (let error) {
print(">>> Could not create attributed text from \(html)\nError: \(error)")
return false
}
}
}
extension NSMutableAttributedString {
/// Replace any font with the specified font (including its pointSize) while still keeping
/// all other attributes like bold, italics, spacing, etc.
/// See https://stackoverflow.com/questions/19921972/parsing-html-into-nsattributedtext-how-to-set-font
func replaceFonts(with font: UIFont) {
let baseFontDescriptor = font.fontDescriptor
var changes = [NSRange: UIFont]()
enumerateAttribute(.font, in: NSMakeRange(0, length), options: []) { foundFont, range, _ in
if let htmlTraits = (foundFont as? UIFont)?.fontDescriptor.symbolicTraits,
let adjustedDescriptor = baseFontDescriptor.withSymbolicTraits(htmlTraits) {
let newFont = UIFont(descriptor: adjustedDescriptor, size: font.pointSize)
changes[range] = newFont
}
}
changes.forEach { range, newFont in
removeAttribute(.font, range: range)
addAttribute(.font, value: newFont, range: range)
}
}
}
Thanks for the answers, I really liked the extension but I have not converted to swift yet. For those old schoolers still in Objective-C this should help a little :D
-(void) setBaseFont:(UIFont*)font preserveSize:(BOOL) bPreserve {
UIFontDescriptor *baseDescriptor = font.fontDescriptor;
[self enumerateAttribute:NSFontAttributeName inRange:NSMakeRange(0, [self length]) options:0 usingBlock:^(id _Nullable value, NSRange range, BOOL * _Nonnull stop) {
UIFont *font = (UIFont*)value;
UIFontDescriptorSymbolicTraits traits = font.fontDescriptor.symbolicTraits;
UIFontDescriptor *descriptor = [baseDescriptor fontDescriptorWithSymbolicTraits:traits];
UIFont *newFont = [UIFont fontWithDescriptor:descriptor size:bPreserve?font.pointSize:baseDescriptor.pointSize];
[self removeAttribute:NSFontAttributeName range:range];
[self addAttribute:NSFontAttributeName value:newFont range:range];
}]; }
Happy Coding!
--Greg Frame
Here is an extension for NSString that returns an NSAttributedString using Objective-C.
It correctly handles a string with HTML tags and sets the desired Font and Font color while preserving HTML tags including BOLD, ITALICS...
Best of all it does not rely on any HTML markers to set the font attributes.
@implementation NSString (AUIViewFactory)
- (NSAttributedString*)attributedStringFromHtmlUsingFont:(UIFont*)font fontColor:(UIColor*)fontColor
{
NSMutableAttributedString* mutableAttributedString = [[[NSAttributedString alloc] initWithData:[self dataUsingEncoding:NSUTF8StringEncoding] options:@{NSDocumentTypeDocumentAttribute : NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute : @(NSUTF8StringEncoding)} documentAttributes:nil error:nil] mutableCopy]; // parse text with html tags into a mutable attributed string
[mutableAttributedString beginEditing];
// html tags cause font ranges to be created, for example "This text is <b>bold</b> now." creates three font ranges: "This text is " , "bold" , " now."
[mutableAttributedString enumerateAttribute:NSFontAttributeName inRange:NSMakeRange(0, mutableAttributedString.length) options:0 usingBlock:^(id value, NSRange range, BOOL* stop)
{ // iterate every font range, change every font to new font but preserve symbolic traits such as bold and italic (underline and strikethrough are preserved automatically), set font color
if (value)
{
UIFont* oldFont = (UIFont*)value;
UIFontDescriptor* fontDescriptor = [font.fontDescriptor fontDescriptorWithSymbolicTraits:oldFont.fontDescriptor.symbolicTraits];
UIFont* newFont = [UIFont fontWithDescriptor:fontDescriptor size:font.pointSize];
[mutableAttributedString removeAttribute:NSFontAttributeName range:range]; // remove the old font attribute from this range
[mutableAttributedString addAttribute:NSFontAttributeName value:newFont range:range]; // add the new font attribute to this range
[mutableAttributedString addAttribute:NSForegroundColorAttributeName value:fontColor range:range]; // set the font color for this range
}
}];
[mutableAttributedString endEditing];
return mutableAttributedString;
}
@end
Swift 5 Solution for UILabel and UITextView
extension UITextView {
func setHTMLFromString(htmlText: String) {
let modifiedFont = String(format:"<span style=\"font-family: '-apple-system', 'HelveticaNeue'; font-size: \(self.font!.pointSize)\">%@</span>", htmlText)
let attrStr = try! NSAttributedString(
data: modifiedFont.data(using: .unicode, allowLossyConversion: true)!,
options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding:String.Encoding.utf8.rawValue],
documentAttributes: nil)
self.attributedText = attrStr
}
}
extension UILabel {
func setHTMLFromString(htmlText: String) {
let modifiedFont = String(format:"<span style=\"font-family: '-apple-system', 'HelveticaNeue'; font-size: \(self.font!.pointSize)\">%@</span>", htmlText)
let attrStr = try! NSAttributedString(
data: modifiedFont.data(using: .unicode, allowLossyConversion: true)!,
options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding:String.Encoding.utf8.rawValue],
documentAttributes: nil)
self.attributedText = attrStr
}
}
Usage for UILabel
self.label.setHTMLFromString(htmlText: htmlString)
Usage for UITextView
self.textView.setHTMLFromString(htmlText: htmlString)
A Swift 3 String extension that also accepts a nil font. The property without a font is taken from another SO question; I do not remember which one :(
extension String {
var html2AttributedString: NSAttributedString? {
guard let data = data(using: .utf8) else {
return nil
}
do {
return try NSAttributedString(data: data, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue], documentAttributes: nil)
}
catch {
print(error.localizedDescription)
return nil
}
}
public func getHtml2AttributedString(font: UIFont?) -> NSAttributedString? {
guard let font = font else {
return html2AttributedString
}
let modifiedString = "<style>body{font-family: '\(font.fontName)'; font-size:\(font.pointSize)px;}</style>\(self)";
guard let data = modifiedString.data(using: .utf8) else {
return nil
}
do {
return try NSAttributedString(data: data, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue], documentAttributes: nil)
}
catch {
print(error)
return nil
}
}
}
Swift Solution
The below approach works. You can very well provide the font family, font size, and color in this approach. Feel free to suggest changes or any better way of doing this.
extension UILabel {
func setHTMLFromString(htmlText: String,fontFamily:String,fontColor:String) {
let modifiedFont = String(format:"<span style=\"font-family: '-apple-system', \(fontFamily); font-size: \(self.font!.pointSize); color: \(fontColor) ; \">%@</span>", htmlText)
do{
if let valData = modifiedFont.data(using: .utf8){
let attrStr = try NSAttributedString(data: valData, options: [NSAttributedString.DocumentReadingOptionKey.documentType : NSAttributedString.DocumentType.html.rawValue], documentAttributes: nil)
self.attributedText = attrStr
}
}catch{
print("Conversion failed with \(error)")
self.attributedText = nil
}
}
}
Actually, an even easier and cleaner way exists. Just set the font after parsing the HTML:
NSMutableAttributedString *text = [[NSMutableAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding]
options:@{
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
documentAttributes:nil error:nil];
[text addAttributes:@{NSFontAttributeName: [UIFont fontWithName:@"Lato-Regular" size:20]} range:NSMakeRange(0, text.length)];

How do i convert NSAttributedString into HTML string?

As the title says, I can now simply convert HTML into an NSAttributedString with initWithHTML:documentAttributes:, but what I want to do here is the reverse.
Is there any 3rd party library to achieve this?
@implementation NSAttributedString(HTML)
-(NSString *)htmlForAttributedString{
NSArray * exclude = [NSArray arrayWithObjects:@"doctype",
@"html",
@"head",
@"body",
@"xml",
nil
];
NSDictionary * htmlAtt = [NSDictionary
dictionaryWithObjectsAndKeys:NSHTMLTextDocumentType,
NSDocumentTypeDocumentAttribute,
exclude,
NSExcludedElementsDocumentAttribute,
nil
];
NSError * error;
NSData * htmlData = [self dataFromRange:NSMakeRange(0, [self length])
documentAttributes:htmlAtt error:&error
];
//NSAttributedString * htmlString = [[NSAttributedString alloc]initWithHTML:htmlData documentAttributes:&htmlAtt];
NSString * htmlString = [[NSString alloc] initWithData:htmlData encoding:NSUTF8StringEncoding];
return htmlString;
}
@end
Use dataFromRange:documentAttributes: with the document type attribute (NSDocumentTypeDocumentAttribute) set to HTML (NSHTMLTextDocumentType):
NSAttributedString *s = ...;
NSDictionary *documentAttributes = @{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType};
NSData *htmlData = [s dataFromRange:NSMakeRange(0, s.length) documentAttributes:documentAttributes error:NULL];
NSString *htmlString = [[NSString alloc] initWithData:htmlData encoding:NSUTF8StringEncoding];
This is a Swift 4 conversion of @omz's answer; I hope it is useful to anyone landing here.
extension NSAttributedString {
var attributedString2Html: String? {
do {
let htmlData = try self.data(from: NSRange(location: 0, length: self.length), documentAttributes:[.documentType: NSAttributedString.DocumentType.html]);
return String.init(data: htmlData, encoding: String.Encoding.utf8)
} catch {
print("error:", error)
return nil
}
}
}