Text heavy iOS App. Store text in HTML, Plist, or Other? - html

I'm writing relatively complex iOS app that is very text heavy.
The text is also heavily formatted. It has lots of color, size, font, and spacing changes, as well bulleted lists and other text features you'd expect to see in a very rich website.
The text is displayed on about 40 different views. Some of which display a lot of text, others a little. There is no one template that all the pages follow. (There are some that are similar, but that's not the point.)
Lastly, the text is constantly being changed and updated by an editorial team during development, not so much after release. The text has to be stored on the device, downloading files is not an option.
My question is, what is the best way to store and then render all this text in an iOS App?
My approach
Store all the text content and formatting info in an html file and use
[[NSAttributedString alloc] initWithFileURL:htmlDoc
options:#{
NSDocumentTypeDocumentAttribute:NSHTMLTextDocumentType}
documentAttributes:&attrDict
error:&error];
to create a NSAttributed string and use that to populate UITextViews.*
*Note: I would do some more work before creating the UITextViews. First I would parse it to find the appropriate page number [[Page:1.3]] and then parse the elements in that section [[header]], [[side_scroller]], etc...
I like this approach for two main reasons:
It created a separate copy document that contained all the text
and formatting info.
I'm the only iOS developer, but we have a couple front-end
developers. So when we get slammed with changes that need to be done
in 3.45 minutes, I could have some of the guys help me make the
changes, without having to know all the nuances of UIFont and
related classes. Occasionally, the editors could even make the
changes themselves :)
Minor reasons for liking this approach:
The text can vary so much per page, that creating a new UIFont + Plist entry to store the formatting info seems like a bigger pain than having everything in a .html document. (I could be wrong about this.)
Project managers will inevitably say: "Make this word a little bigger," "This word looks strange, add italics," and "Make everything purple!" HTML/CSS seems like a more flexible solution for quickly implementing these requests.
Downsides of this approach:
NSAttributedString picks up 99% of the HTML attributes I threw at it. It did not pick bullet spacing changes in unordered lists <ul>.
Plists are more performant.
Here are some other approaches I considered:
Plist + UIFont
RTF Document - Originally started with this, but found it hid a lot of what was going on and NSAttributedString wouldn't pick up some of the changes.
XML
Any advice or input would very appreciated.
Notes:
iPad app,
iOS 7,
No Internet Connectivity,
Xcode 5

What I did to store styled text in an iOS app was to write a Mac OS command line tool that opens RTF files and converts them to attributed strings (It's a 1-line call in Mac OS, but not supported in iOS for some reason.) I then use NSCoding to save the attributed strings as binary data, with a special .DATA filetype.
I created a custom UITextView category with a method that knows how to load the text view's attributed text from my custom filetype.
I created a build rule in my project that treats RTF files as source files in a build step and the .DATA filetype as the output, and copies the .DATA files into the build project.
Now, all I have to do is add an RTF file to my project the build process inserts the .DATA version of the styled text into the executable.
The Xcode editor knows how to edit RTF files, so you can edit them right in place in the IDE, OR you can edit them in TextEdit or any editor that supports RTF files.
There are a few things you can put in an RTF that aren't supported in UITextViews. (I don't remember what those are offhand. Sorry.)
I find styled WYSIWYG text much easier to deal with than HTML. You just edit the text, and the build process picks up the changes.
It worked beautifully. Plus, binary NSCoding output is a whole lot more compact than HTML.

I would recommend using web view. It can open files in resource bundle.
You can disable all the links in HTML by implementing delegate method shouldStartLoadWithRequest to return NO.
You might also want to set dataDetectorTypes to UIDataDetectorTypeNone.
That will disable auto link detection in web view

Related

Swift - MacOS NSPasteboard - simultaneously support HTML and plain strings

I have a use-case where I want my application to store clipboard data and later allow the user to paste it out.
The thing is - I want to both support HTML-styled text (while maintaining its styling) and also support simple plain-text.
What I have so far is:
Pasteboard variable setup
let pasteboard = NSPasteboard.general
pasteboard.declareTypes([.html, .string], owner: nil)
Storing user's clipboard data
let copiedString = pasteboard.string(forType: .html) ?? pasteboard.string(forType: .string) {
If, for example, a user copies text from the browser, this method maintains the entire HTML.Alternatively, if text is being copied from applications that don't support HTML (such as XCode, Notes, etc), `copiedString` will simply hold the plain text.
Later down the line - pushing contents back to clipboard
pasteboard.setString(copiedString, forType: .html)
My problem is, when I try to paste the content pushed into my clipboard; it only works in applications that actually support HTML-formatted text, such as Chrome / Microsoft Word.
When I try to paste the content into XCode for example, it simply doesn't spit out anything.
Ideally, I want my clipboard to adjust itself according to the application I'm on - paste the text as HTML if supported by the application, otherwise - paste the plain text.
How can I implement such behavior?
Thanks!

Parse and Display Table from HTML String in NSMutableAttributedString

I have been in the process of converting my Android/Java app into iOS/Swift, and have run into an issue regarding table generation.
The app displays a list of content pulled from a web source as html strings, and displays these in a UITableView (RecyclerView on Android). Creating an NSMutableAttributedString with NSDocumentTypeDocumentAttribute:NSHTMLTextDocumentType works fine for basic html formatting (bold, italics, lists, etc), but I need to render tables as well.
In my Android app, I created blocks of content split at tags, and if it was a table, populated a TableView and inserted it into the correct place in the text using a LinearLayout.
The cells are programmatically generated in Swift, so I pre-process the HTML when pulled from the web and store the NSMutableAttributedStrings in an array along with the estimated heights in order to make scrolling smooth. I ran across the NSTextTable in Apple's documentation, but cannot seem to figure out how to use this class, and whether they can be embedded in an NSMutableAttributedString in order to access the table later.
If I'm totally misunderstanding what that class does, please let me know.

Way To Modify HTML Before Display using Cocoa Webkit for Internationalization

In Objective C to build a Mac OSX (Cocoa) application, I'm using the native Webkit widget to display local files with the file:// URL, pulling from this folder:
MyApp.app/Contents/Resources/lang/en/html
This is all well and good until I start to need a German version. That means I have to copy en/html as de/html, then have someone replace the wording in the HTML (and some in the Javascript (like with modal dialogs)) with German phrasing. That's quite a lot of work!
Okay, that might seem doable until this creates a headache where I have to constantly maintain multiple versions of the html folder for each of the languages I need to support.
Then the thought came to me...
Why not just replace the phrasing with template tags like %CONTINUE%
and then, before the page is rendered, intercept it and swap it out
with strings pulled from a language plist file?
Through some API with this widget, is it possible to intercept HTML before it is rendered and replace text?
If it is possible, would it be noticeably slow such that it wouldn't be worth it?
Or, do you recommend I do a strategy where I build a generator that I keep on my workstation which builds each of the HTML folders for me from a main template, and then I deploy those already completed with my setup application once I determine the user's language from the setup application?
Through a lot of experimentation, I found an ugly way to do templating. Like I said, it's not desirable and has some side effects:
You'll see a flash on the first window load. On first load of the application window that has the WebKit widget, you'll want to hide the window until the second time the page content is displayed. I guess you'll have to use a property for that.
When you navigate, each page loads twice. It's almost not noticeable, but not good enough for good development.
I found an odd quirk with Bootstrap CSS where it made my table grid rows very large and didn't apply CSS properly for some strange reason. I might be able to tweak the CSS to fix that.
Unfortunately, I found no other event I could intercept on this except didFinishLoadForFrame. However, by then, the page has already downloaded and rendered at least once for a microsecond. It would be great to intercept some event before then, where I have the full HTML, and do the swap there before display. I didn't find such an event. However, if someone finds such an event -- that would probably make this a great templating solution.
- (void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame
{
DOMHTMLElement * htmlNode =
(DOMHTMLElement *) [[[frame DOMDocument] getElementsByTagName: #"html"] item: 0];
NSString *s = [htmlNode outerHTML];
if ([s containsString:#"<!-- processed -->"]) {
return;
}
NSURL *oBaseURL = [[[frame dataSource] request] URL];
s = [s stringByReplacingOccurrencesOfString:#"%EXAMPLE%" withString:#"ZZZ"];
s = [s stringByReplacingOccurrencesOfString:#"</head>" withString:#"<!-- processed -->\n</head>"];
[frame loadHTMLString:s baseURL:oBaseURL];
}
The above will look at HTML that contains %EXAMPLE% and replace it with ZZZ.
In the end, I realized that this is inefficient because of page flash, and, on long bits of text that need a lot of replacing, may have some quite noticeable delay. The better way is to create a compile time generator. This would be to make one HTML folder with %PARAMETERIZED_TAGS% inside instead of English text. Then, create a "Run Script" in your "Build Phase" that runs some program/script you create in whatever language you want that generates each HTML folder from all the available lang-XX.plist files you have in a directory, where XX is a language code like 'en', 'de', etc. It reads the HTML file, finds the parameterized tag match in the lang-XX.plist file, and replaces that text with the text for that language. That way, after compilation, you have several HTML folders for each language, already using your translated strings. This is efficient because then it allows you to have one single HTML folder where you handle your code, and don't have to do the extremely tedious process of creating each HTML folder in each language, nor have to maintain that mess. The compile time generator would do that for you. However -- you'll have to build that compile time generator.

convert docx with (ordered) list to html

I'm trying to convert a large docx document with several layers' ordered list to an html. (see an example of the document here: http://docdro.id/X1oyfBv You should download it)
I tried the following things, including:
online converters such as html-cleaner and index.html (which only recognize one layer of the list)
save as html - which creates an horrendous file but still doesn't recognize the ol structure.
saved the file as zip and then opened the xml file, but I dont see an easy way to get the ol structure out of the w:... tags
saving it to google docs and running Omar Alzabir's script
http://omaralzabir.com/wp-content/uploads/2014/05/GoogleDocsEmail.jpg
btw. If I create a word file with an ordered list with multiple layers and i convert it, it does recognize it as ol's. But the existing file is not recognized as ol's even if I 'un-list' and list it again. So possibly there is something wrong with how the original document was created (?)
Any suggestions much appreciated:) Or indications as to why this problem occurs
Are you asking how to save a Word-doc in HTML format, with multi-level ordered-lists?
Word-HTML has bugs in its multi-level ordered lists. For the list-items, the indentation tends to be incorrect and inconsistent. There's an example here.
Word-HTML has similar bugs in its multi-level unordered lists. An example is here.
I recently wrote a Python program that fixes these bugs, in Word's HTML. The program is part of WordWebNav (WWN), which is free and open-source.
WWN is an app that converts a Microsoft-Word document to a usable web-page. It adds some missing features in the Word-HTML web-page (e.g., a navigation pane), and it fixes bugs in the Word-HTML.
You can use pandoc : https://github.com/jgm/pandoc
This is an open source universal command line tool to convert markup source based document files.
You can use it as something like that:
pandoc -o output.html input.docx

html idml viewer

I am trying to implement a c# idml to html converter. I've managed to produce a single flat html file similar to the one produced by the indesign export.
What I would like to do is to produce html that will be as similar as possible to the indesign view like an html idml viewer. To do this, I need to find the text that can fit into a textframe, I can extract the story text content but I can't really find a way to split this content into frames/pages.
Is there any way I can achieve that?
Just extracting the text from a story isn't enough. The way the text is laid out is controlled by TextFrames in the Spread documents. Each TextFrame has a ParentStory attribute, showing which story it loads text from, and each frame has dimensions which determine the layout. For unthreaded text frames (ie. one story <> one frame), that's all you need.
For threaded frames, you need to use the PreviousTextFrame and NextTextFrame attributes to create the chain. There is nothing in the IDML to tell you how much text fits in each frame in a threaded chain, you need to do the calculation yourself based on the calculated text dimensions (or using brute force trial and error).
You can find the spreads in the main designmap.xml:
<idPkg:Spread src="Spreads/Spread_udd.xml" />
And the spread will contain one or more TextFrame nodes:
<Spread Self="udd" ...>
<TextFrame Self="uf7" ParentStory="ue5" PreviousTextFrame="n" NextTextFrame="n" ContentType="TextType">...</>
...
</Spread>
Which will in turn link to a specific story:
<Story Self="ue5" AppliedTOCStyle="n" TrackChanges="false" StoryTitle="$ID/" AppliedNamedGrid="n">...</>
(In this example the frames are not threaded, hence the 'n' values.
All this is in the IDML documentation, which you can find with the other InDesign developer docs here: http://www.adobe.com/devnet/indesign/documentation.html
Microsoft and Adobe have proposed a new module for css named Regions which allow you to do flow tekst into multiple containers. Keep in mind that you will never be able to create an html page that looks exactly like an Indesign document.
http://www.w3.org/TR/css3-regions/
For now only IE10 and webkit nightly support it: http://caniuse.com/#feat=css-regions