iPhone UIWebView: download complete page with CSS and images

In my app there's a UIWebView that loads a URL.
I am using the following line to save the HTML of the loaded page locally, to be able to view it offline:
NSString *html = [webView stringByEvaluatingJavaScriptFromString:@"document.getElementsByTagName('html')[0].innerHTML"];
The problem is that only the HTML of the document gets saved. I also want to save the images and the CSS along with the HTML, so that the user sees the page as if they were online.
Just like "save web page complete" or something like that, that we're used to in the browsers.

There is no easy way. Regex the HTML using RegexKitLite (http://regexkit.sourceforge.net/RegexKitLite/index.html) and snag all the URLs to .jpg, .gif, .png, .css, .js and whatever else you need.
Alternately, call:
NSString *imgUrls = [webView stringByEvaluatingJavaScriptFromString:@"Array.prototype.map.call(document.getElementsByTagName('img'), function (img) { return img.src; }).join('\\n')"];
or something like that, I'm no javascript whizz... and then deal with whatever all that returns ;)
Sorry. It's a pain in the rearheinie.
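To make the regex idea concrete, here is a minimal sketch in Python rather than Objective-C/RegexKitLite (the same pattern should carry over, but that is untested). The function name find_resource_urls and the extension list are my own assumptions, and a regex like this will miss URLs inside CSS or inline scripts.

```python
import re

# Matches src="..." or href="..." values whose path ends in a wanted extension.
# A query string or fragment after the extension is tolerated but not captured.
RESOURCE_RE = re.compile(
    r'''(?:src|href)\s*=\s*["']([^"'?#]+\.(?:jpe?g|gif|png|css|js))["'?#]''',
    re.IGNORECASE,
)

def find_resource_urls(html):
    """Return the unique resource URLs found in html, in order of first appearance."""
    seen = []
    for url in RESOURCE_RE.findall(html):
        if url not in seen:
            seen.append(url)
    return seen
```

Each URL returned would then have to be downloaded and stored next to the saved HTML file.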
edit:
Save all the img's on the iphone, also save the html file. When you want to reload the page, load the html from a file into a string, and then use
- (void)loadHTMLString:(NSString *)string baseURL:(NSURL *)baseURL
to load the HTML string. baseURL specifies the directory or site where the webview should pretend the HTML string is located. All relative URLs in the HTML are resolved against it.
Note, of course, that this only helps for relative URLs, not for absolute ones. So this, in your HTML file, will monkey things up:
<img src="http://google.com/f/r/i/g/img.gif">
while this would be ok:
<img src="f/r/i/g/img.gif">
Again, this whole solution is mucky.
You might look into a pre-existing open source recursive html spider. I think wget does what you want, but I doubt it can be compiled for iPhone without a -lot- of hassle.
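One way around the absolute-URL problem is to rewrite those URLs in the saved HTML so they point at the local copies. A rough sketch in Python, assuming you already have the list of resource URLs; local_path_for and localize are made-up names, and prefixing the host name is just one possible layout:

```python
from urllib.parse import urlparse

def local_path_for(url):
    """Map a resource URL to a relative path under the save directory."""
    parts = urlparse(url)
    path = parts.path.lstrip("/")
    # Assumption: prefix the host so files from different sites cannot collide.
    # Any query string is dropped from the local file name.
    return f"{parts.netloc}/{path}" if parts.netloc else path

def localize(html, urls):
    """Replace every known resource URL in html with its local relative path."""
    # Longest first, so a URL that is a prefix of another does not clobber it.
    for url in sorted(urls, key=len, reverse=True):
        html = html.replace(url, local_path_for(url))
    return html
```

With the files saved under those paths, the rewritten HTML contains only relative URLs, which is the case that works with baseURL pointing at the save directory.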

I didn't have time to check, but ASIWebPageRequest seems very promising. It states it can "Store a complete web page in a single string, or with each external resource in a separate file referenced from the page"
ASIWebPageRequest
I will be using it on one of my projects, and will then update this thread.
Gonso

Related

Creating multiple HTML pages in 1 file?

I've just started learning HTML and I've begun to make my own website (for fun). I was wondering how I can load a certain video depending on what comes after the '/'.
localhost/video1
localhost/video2
Rather than having possibly hundreds of .html files, one for each video, is there a way to simply pick the video/file based on the URL? The code is the same; only the video differs.
There are several options:
You could use JavaScript to read the anchor hash of your page. The URLs then look like localhost/#video1 (see "How can you check for a #hash in a URL using JavaScript?").
Use a server-side language, for example PHP, instead of plain HTML files. With mod_rewrite you can serve localhost/video1.php under the URL localhost/video1 (see "Removing the .php extension with mod_rewrite").
A third option is to use a router library like http://smalljs.org/client-side-routing/page/ or http://projects.jga.me/routie/
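To illustrate the server-side variant: the sketch below uses Python's built-in http.server instead of PHP and mod_rewrite, so it is a different tool than the ones named above, but the idea is the same: one template, filled in from the request path. The /media/ location and the names render_page and VideoHandler are assumptions for the example, and the sketch does not serve the video files themselves.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer

# One template shared by every video page.
PAGE = """<!DOCTYPE html>
<html><body>
<h1>{name}</h1>
<video controls src="/media/{name}.mp4"></video>
</body></html>"""

def render_page(path):
    """Return the HTML for a path like '/video1', or None for any other path."""
    name = path.strip("/")
    if not name.startswith("video") or not name[5:].isdigit():
        return None
    return PAGE.format(name=escape(name))

class VideoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_page(self.path)
        if body is None:
            self.send_error(404)
            return
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port=8000):
    """Start the server; call this yourself, it blocks until interrupted."""
    HTTPServer(("localhost", port), VideoHandler).serve_forever()
```

After calling serve(), localhost:8000/video1 and localhost:8000/video2 return the same markup with a different video source.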

Including images in a Genshi/Trac template

I am trying to include some images in a Genshi template for my Trac plugin, but it always shows only the alternative text because it cannot find the images.
I have the following (X)HTML code:
<div>
<img src="file://c:/path/to/image.png" alt="asdf" />
</div>
When I use this code with a simple html file and open it in the browser, the image is displayed correctly, which means that both the path and syntax are correct.
But when I insert the code snippet into a Genshi template and use it within Trac, the image cannot be found. However, when I look at the HTML source code in the web browser and copy the URLs into a new browser tab, it is again displayed correctly. This means that only the server cannot find the image.
The images are in a directory inside the python-egg file, and the path points directly to the directory created by Trac, which also contains my CSS and HTML files, both of which are loaded correctly. The images are correctly referenced in the setup script which creates the egg.
How do I have to reference images in (X)HTML documents when using them with a server?
Is there a special way to include images in Genshi documents? (I haven't found one.)
Thanks to the comment of RjOllos and this site I was able to fix it by trying all of the URL types. Although it says for a plugin to be /chrome/<pluginname>, it was actually just /chrome that worked. See the edit below! So the full URL is then <ip>:<port>/chrome/path/to/image.png.
EDIT: I discovered I actually used the /chrome/pluginname version, just that I did not use the name of my plugin as "pluginname". See my comment below. It seems like /chrome/pluginname should actually be /chrome/htdocsname or something like that, in case you use a name other than the plugin name when implementing the ITemplateProvider. In my case I called it images, which was the same name as the folder. END OF EDIT
Another mistake I made was forgetting the initial slash (chrome/path/to/image.png), which caused Trac to assemble the URL to <ip>:<port>/<current page>/chrome/path/to/image.png.
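The effect of the missing slash can be reproduced with Python's standard URL resolution, which follows the same rules a browser applies to relative references. The host, port and page name below are made up for the example:

```python
from urllib.parse import urljoin

# Hypothetical URL of the Trac page that contains the <img> tag.
page = "http://localhost:8000/wiki/StartPage"

# With the leading slash the path is resolved from the server root.
with_slash = urljoin(page, "/chrome/images/logo.png")
# -> "http://localhost:8000/chrome/images/logo.png"

# Without it the path is resolved relative to the current page's location.
without_slash = urljoin(page, "chrome/images/logo.png")
# -> "http://localhost:8000/wiki/chrome/images/logo.png"
```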

Sitecore and HTML files

We have a bunch of files (css, js, html, flash, swf, etc.) put together by a third party to show videos on our site. This link is an example of the type of rendered output that I'm talking about - http://www.esi-intl.com/public/us/resources/virtualclassroom/presentation.htm. This isn't my company, but I was able to find this via Google since our site is not live yet.
Our editors would like to include these files in the Media Library and display these pages from there. I've tried to include these files, but the HTML page doesn't render; instead it is offered as a download. I've tried commenting out the MIME type in the Mimetype config file, but I'm not having any luck.
Can the Sitecore media handler be modified to get these HTML files to render as pages?
Thanks
Yes, HTML files in the Media Library can be rendered.
To get the correct MIME type, look at the setting name="Media.RequestExtension" and set its value to "". You then get the original extension.
That makes it easier for the web server to send the correct MIME type.

Is there a way to export a page with CSS/images/etc using relative paths?

I work on a very large enterprise web application - and I created a prototype HTML page that is very simple - it is just a list of CSS and JS includes with very little markup. However, it contains a total of 57 CSS includes and 271 javascript includes (crazy right??)
In production these CSS/JS files will be minified and combined in various ways, but for dev purposes I am not going to bother.
The HTML is being served by a simple apache HTTP server and I am hitting it with a URL like this: http://localhost/demo.html and I share this link to others but you must be behind the firewall to access it.
I would like to package up this one HTML file with all referenced JS and CSS files into a ZIP file and share this with others so that all one would need to do is unzip and directly open the HTML file.
I have 2 problems:
The CSS files reference images using URLs like this url(/path/to/image.png) which are not relative, so if you unzip and view the HTML these links will be broken
There are literally thousands of other JS/CSS files/images that are also in these same folders that the demo doesn't use, so just zipping up the entire folder will result in a very bloated zip file
Anyway -
I create these types of demos on a regular basis, is there some easy way to create a ZIP that will:
Have updated CSS files that use relative URLs instead
Only include the JS/CSS that this html references, plus only those images which the specific CSS files reference as well
If I could do this without a bunch of manual work, if it could be automatic somehow, that would be so awesome!
As an example, one CSS file might have the following path and file name.
/ui/demoapp/css/theme.css
In this CSS file you'll find many image references like this one:
url(/ui/common/img/background.png)
I believe for this to work the relative image path should look like this:
url(../../common/img/background.png)
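The conversion in this example can be automated. A minimal sketch in Python, assuming the CSS file's own site path is known; relativize_css is a made-up name, and the regex only handles root-relative url(...) references, not @import strings or full http:// URLs:

```python
import posixpath
import re

# url(/path), url('/path') or url("/path"); skips protocol-relative //host/... URLs.
URL_RE = re.compile(r'''url\(\s*(['"]?)(/(?!/)[^'")]+)\1\s*\)''')

def relativize_css(css_text, css_path):
    """Rewrite root-relative url(...) references relative to the CSS file's directory."""
    css_dir = posixpath.dirname(css_path)

    def repl(match):
        quote, target = match.group(1), match.group(2)
        rel = posixpath.relpath(target, css_dir)
        return f"url({quote}{rel}{quote})"

    return URL_RE.sub(repl, css_text)

css = "body { background: url(/ui/common/img/background.png); }"
print(relativize_css(css, "/ui/demoapp/css/theme.css"))
# prints: body { background: url(../../common/img/background.png); }
```

The set of match.group(2) values collected along the way is also the list of images the ZIP needs to contain.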
I am going to answer my own question because I have solved the problem for my own purposes. There are 2 options that I have found useful:
Modern browsers have a "Save Page As..." option under the File menu, or in Chrome in the main menu. This, however, does not always work properly when the page is generated by JavaScript.
I created my own custom application that can parse out all of the CSS/Javascript resources and transform the CSS references to relative URLs; however, this is not really a good answer for others.
If anyone else is aware of a commonly available utility or something like that which is better than using the browser built in "Save page as..." option - feel free to post another answer.

Usage of audio and video tag in Nitrogen

Still working on my personal web server, I was trying to use the html5 audio and video tags within Nitrogen.
As there is no #audio nor #video records, I decided to insert html text directly in the page generated by nitrogen, the result looks like this:
<audio controls preload="metadata"><source src="../../My Music/subdir/song.ogg" type="audio/ogg" />audio tags not supported</audio>
In my understanding this should work because the audio tag is supposed to be interpreted directly by the client browser, and there is not any nitrogen id or event observer in the code.
But when I browse this code from Firefox, I briefly see the control opening, and then the audio element simply disappears.
If I copy and paste the whole code generated by Nitrogen (display the HTML source of the page, copy it and paste it into a file located at the root of the Nitrogen project) and open it with the browser, it works fine. The relative path is correct, assuming that the search starts in the Nitrogen project. I have also tried an absolute path, without success.
I don't know
if it builds a file name of the form ".._.._My music_subdir_song.ogg" like nitrogen does for url analysis,
or if it uses another directory to start the path,
or if it simply doesn't work the way I am thinking.
...
Edit: some complementary information:
I have done the following changes:
I created one directory containing some ogg files in the site/static directory and moved a static test.html file into site/static. If I open test.html directly -> ok. If I redirect from my web site -> not ok.
Same test with a copy of the directory at the Nitrogen application root and access from my web site -> not ok.
As the information on the web page is ambiguous, I modified test.html to access a file that does not exist on my PC -> same behavior.
I think I'll use the debugger to understand how the request is managed, to be continued...
Edit 2:
Using the debugger I can verify that wf_core:run_catched() is called several times. The first call is when it processes the event in my page that redirects to the static file.
The second time is to process the static html file itself.
A third time is to process finish_static_request() with a Path equal to my_music/song.ogg, and then I get lost in the processing of the answer. Another wf_core:run_catched() was called in parallel, but I didn't follow it...
I have been able to verify that the file can be accessed: I have added several audio tags in the html files, and I was able to "download" the existing files using the DownloadHelper Firefox plugin.
My understanding now is that the path is correct (at least when I place the files in a subdirectory of site/static), the server is able to retrieve the files and send them, and the browser recognizes the audio and video tags, but the link between the embedded audio/video player and the files is lost, although I have added a type definition inside the audio tag.
Any ideas on how to continue?
Edit 3:
Finally I got it. As Chops suggested, I had to go into the inets server configuration, not to define the path, but to define the type association. I have added the following definitions in etc/inets_httpd.erlenv, and it works.
{mime_types, [
{"css", "text/css"},
...
{"ogg","audio/ogg"},
{"webm","video/webm"}
]}
:o)
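For anyone wondering what such a table does: the server looks up the file extension and sends the result as the Content-Type header, and without an entry for "ogg" the browser is not told that the response is audio. The sketch below is only an illustration in Python, not how inets is implemented; the function name and the fallback type are my assumptions.

```python
import posixpath

# Mirrors the entries added to the inets configuration above.
MIME_TYPES = {"css": "text/css", "ogg": "audio/ogg", "webm": "video/webm"}

def content_type_for(path, default="application/octet-stream"):
    """Pick a Content-Type from the file extension (assumed fallback: octet-stream)."""
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    return MIME_TYPES.get(ext, default)
```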
Based on the contents of the src attribute ("../../My Music/subdir/song.ogg"), the problem when it's served from Nitrogen is that the request for the audio (assuming you're using the default 127.0.0.1:8000) will be to the URL "http://127.0.0.1:8000/My Music/subdir/song.ogg".
What you want to do, if you're using the standard Nitrogen installation, is to put the song files you want into the site/static directory, perhaps in a "songs" subdirectory.
Then change the src attribute to be "/songs/mysong.ogg" (or whatever path within site/static you used).
Note: Depending on your server choice (Webmachine, for example), you may need to tinker with the server's specific config file to tell it to handle the new directory for static paths. For help, check the configuration docs on the Nitrogen site.
Beyond that, there's nothing special about outputting raw HTML in Nitrogen. It is my understanding that the problem here is really just related to the paths of the requests being sent to the server.
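The path resolution described in this answer can be checked with Python's urljoin, which applies the same rules as a browser: ".." segments cannot climb above the server root, so they are simply dropped. The page URL below is an assumption, and a real browser would additionally percent-encode the space.

```python
from urllib.parse import urljoin

# Assumed page URL on the default Nitrogen address.
page = "http://127.0.0.1:8000/"

# The relative path from the question, resolved against the page URL.
requested = urljoin(page, "../../My Music/subdir/song.ogg")
# -> "http://127.0.0.1:8000/My Music/subdir/song.ogg"

# The suggested root-relative path, pointing into site/static/songs.
suggested = urljoin(page, "/songs/mysong.ogg")
# -> "http://127.0.0.1:8000/songs/mysong.ogg"
```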