Convert webarchive to html - html

I managed to capture the behavior of a complex web site in a webarchive. I would now like to turn that webarchive into a nested set of HTML directories. Yet when I tried it both with Waf and with commercial software bought on the Apple store, all I got was the nested directory structure with the HTML page at the bottom: no images, no CSS, and no working links.
If you are interested the webarchive document is at:
http://www.miafoto.it/it/GiroMilano.webarchive
while the weak product of the extraction is at:
http://www.miafoto.it/it/Giromilano/Pagine/default.aspx
and the empty directories above.
Apart from looking different, the two also behave differently. The webarchive behaves like the official web site when a listbox value is selected and the button is pushed, while the extracted version produces a page with no contents because it loads itself rather than the official page.
As you can see, the webarchive is over 1 MB while the extraction is just a little over 1 KB.
What is wrong with it, and how can I perform such an apparently trivial task with usable results?
Thanks,

textutil -convert html example.webarchive
Be careful: the HTML and its resource files are created in the same folder as the webarchive!
Also, I had to open the .html in a text editor and replace the "file:///image.tiff" links (replace "file:///" with "") so that they point to relative paths.
Also, not all browsers display .tiff images.
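The search-and-replace step could also be scripted. A minimal sketch in Python, assuming the converted file is UTF-8 and that every "file:///" prefix should simply be dropped, as described in this answer; the function names and example.html are made up for illustration:

```python
from pathlib import Path


def make_links_relative(html: str) -> str:
    """Drop every "file:///" prefix so the links become relative paths."""
    return html.replace("file:///", "")


def fix_file(path: str) -> None:
    """Rewrite one HTML file in place (hypothetical helper)."""
    page = Path(path)
    fixed = make_links_relative(page.read_text(encoding="utf-8"))
    page.write_text(fixed, encoding="utf-8")


if __name__ == "__main__":
    # Demonstrates the rewrite on a single tag instead of a real file.
    print(make_links_relative('<img src="file:///image.tiff">'))
```

Calling fix_file("example.html") would apply the same replacement to the file that textutil produced.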
Who knew we have Stack Overflow wiki?

I find that WebArchiveExtractor.app works on my Mac (macOS Mojave):
https://robrohan.github.io/WebArchiveExtractor/
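For context, a Safari .webarchive is, as far as I know, a binary property list whose WebMainResource and WebSubresources entries hold the page and its assets. Below is a rough sketch of an extractor along those lines in Python. It is not how WebArchiveExtractor.app is implemented; the function name and the output layout (one folder per host) are my own assumptions, and nested frames are ignored.

```python
import plistlib
from pathlib import Path
from urllib.parse import urlparse


def extract_webarchive(archive_path: str, out_dir: str) -> list[str]:
    """Write every resource stored in a .webarchive below out_dir."""
    with open(archive_path, "rb") as fh:
        archive = plistlib.load(fh)

    # The main page plus all images, stylesheets, scripts, etc.
    resources = [archive["WebMainResource"]] + archive.get("WebSubresources", [])

    written = []
    for res in resources:
        url = urlparse(res["WebResourceURL"])
        rel = url.path.lstrip("/")
        # Directory-style URLs get a file name (assumed: index.html).
        if not rel or rel.endswith("/"):
            rel += "index.html"
        target = Path(out_dir) / url.netloc / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(res["WebResourceData"])
        written.append(str(target))
    return written
```

Note that this only unpacks the files. Links inside the HTML still point at the original URLs, so they would need rewriting before the pages work offline.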

I solved the issue by finding all the parameters submitted by the page and submitting them from my own script, ignoring the webarchive altogether.

To save HTML pages on a Mac, I use Chrome. Download and install it, then save your page as HTML. Safari saves web pages in the webarchive format, which I find very hard to deal with.

Related

Why isn't my HTML page loading completely?

I recently started my first HTML project with the help of Youtube. I'm a beginner and only saw the basics of Javascript in college.
Just finished writing my HTML project and wanted to upload it for free using Google Drive and drv.tw.
The only problem is that certain images and icons do not load (irregularly) and/or the pages on the navigation bar take too long to switch.
My question would be, is it because of the free domain or did I do something wrong in HTML?
When I open the HTML file in Safari everything works fine.
Since I'm new to the community, I don't know exactly what and how much I have to upload to get help. So have mercy on me :'D.
To help troubleshoot, use your browser's developer tools (F12). Look at the Network view to see why the images aren't displayed. For example, the status might be "not found", "not authorized", or something else.
From your comment, it would appear that at times the page is rendering before the images are available to display or something else is limiting image files from being presented at all.
Once we know why the browser can't display the images, then the cause can be addressed.
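The "not found" case can also be checked offline before uploading. A small sketch in Python, assuming the site sits in a local folder and the images are referenced with relative src attributes; the class and function names are made up for illustration:

```python
from html.parser import HTMLParser
from pathlib import Path


class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html_file: str) -> list[str]:
    """Return relative image paths used by html_file that are not on disk."""
    page = Path(html_file)
    parser = ImgCollector()
    parser.feed(page.read_text(encoding="utf-8"))
    return [
        src for src in parser.sources
        # Skip absolute URLs; only local files can be checked this way.
        if "://" not in src and not (page.parent / src).exists()
    ]
```

An empty result only rules out missing files. Slow or intermittent loading on the host would still have to be diagnosed in the Network view.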
Post the .html, .js, and .css code.
Please update your question and show the folder structure. It should look something like:
site/
  js/
  css/
  images/
Paste the code below the folder structure, and list the images with their sizes.
I am not familiar with Google Drive's capabilities.
It is important to use a web host. There are plenty of free or low cost sites available, so I won't go into it.

HTML anchors in a Jupyter notebook on Github

So there is a lot out there about creating anchors in markdown, and creating internal table-of-contents-type anchors in a notebook. What I need though is the ability to access an anchor in my notebook on Github from an external source, e.g.:
https://github.com/.../mynotebook.ipynb#thiscell
I've got a number of interactive tutorials hosted this way, and a single manual from which I want to link to individual sections of the notebooks. I can add the anchor tags to markdown cells just fine, using:
<a id='thiscell'></a>
but when I try using the link as I wrote above, it just loads the notebook at the top, as if there was no reference to an anchor.
GitHub renders notebooks using a separate domain, render.githubusercontent.com, and integrates the output in a nested frame. This means that any anchors on the GitHub URL won't work, because the framed document is a different URL entirely.
Moreover, the framed content is not easily re-usable, as the result is a cached rendering of the notebook with a limited lifetime. You can't rely on it sticking around for later linking!
So if you need to be able to link to sections in a notebook, you'd be far better off using the Jupyter notebook viewer service, https://nbviewer.jupyter.org/. It supports showing notebooks from any public URL including GitHub-hosted repositories and GitHub gists. You can also just enter your GitHub user name (or username/repository) for quick access.
This notebook viewer is far more feature-rich than the one GitHub uses. GitHub kills all embedded JavaScript, and strips almost all HTML attributes. Any embedded animations are right out. But the Jupyter nbviewer service supports those directly out of the box.
E.g. compare these two notebooks on nbviewer:
https://nbviewer.jupyter.org/github/mjpieters/adventofcode/blob/master/2018/Day%2020.ipynb
https://nbviewer.jupyter.org/github/mjpieters/adventofcode/blob/master/2018/Day%2021.ipynb
with the same notebooks on GitHub:
https://github.com/mjpieters/adventofcode/blob/master/2018/Day%2020.ipynb
https://github.com/mjpieters/adventofcode/blob/master/2018/Day%2021.ipynb
The first one contains an animation at the end, the second has a complicated table made easier to read by use of some HTML styling and anchor links.
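As the URL pairs above suggest, the nbviewer address is the GitHub path with a different prefix, so the links for a manual could be generated. A minimal sketch in Python; the function name is made up, and "thiscell" in the usage note is the anchor id from the question:

```python
from urllib.parse import urlsplit

NBVIEWER = "https://nbviewer.jupyter.org/github"


def nbviewer_url(github_url: str, anchor: str = "") -> str:
    """Map a github.com notebook URL to nbviewer, optionally adding an anchor."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL")
    # The path (user/repo/blob/branch/file) is reused unchanged.
    url = NBVIEWER + parts.path
    return f"{url}#{anchor}" if anchor else url
```

For example, nbviewer_url("https://github.com/mjpieters/adventofcode/blob/master/2018/Day%2021.ipynb", "thiscell") yields the nbviewer address of that notebook ending in #thiscell.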
I had the same problem. As a workaround, I have delegated the rendering of my notebook to http://nbviewer.jupyter.org. It's just a matter of providing its GitHub public url and clicking Go!
Of course, the internal links still don't work under GitHub, but I now have a functioning notebook somewhere on the web, which is what I actually wanted in the first place.
I hope this applies to your case too.

Some weebly features don't work when exporting to HTML and hosting on a different server

Recently I've been tasked with redesigning a website for the current company I'm working at. I've been using weebly to make the site, and then exporting the HTML to be re-hosted on the company's servers.
However, I've noticed that some functionality in weebly's code has stopped working. I imagine this might be due to weebly hosting some elements on their own servers, but this is merely a beginner's best guess.
1. The picture for the logo on the banner does not appear once the HTML is rehosted
For comparison, here's the site while hosted on weebly:
http://mjmacoustique.weebly.com/
and the site on the company's servers:
http://www.mjm.qc.ca/redesign2015/
When weebly hosts, the ''MJM'' image should be on the top left and function as a return to home page button when clicked. However, when it's hosted on the company's server, the image is not found.
2. On Firefox, the background image of the home page is replaced with an all black background
When opened in Firefox, the site fails to load the background image of the main page.
Any help or solutions to these problems would be greatly appreciated.
Thanks.
I can help with question #1: the logo is hosted on weebly's servers, but in the HTML it is referenced with a root-relative path, like this example: "/uploads/2/6/8/5/26851316/1434298489.png"
The easy workaround would be to keep the weebly version of the site running and, in the HTML, change the src value of the missing images to something like this: http://mjmacoustique.weebly.com/uploads/2/6/8/5/26851316/1434298489.png
So you have to add http://YOURSITE.weebly.com before all the src values of your images.
Otherwise, just upload all the images you need to a blank page of the site on your servers, copy the image URLs, and replace the URLs in the HTML with those.
Hope that helps.
The Firefox issue might also be solved once all your src values are linked correctly, but I cannot be sure about that.
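If there are many images, the prefixing could be done with a script instead of by hand. A rough sketch in Python, assuming the affected src values all start with a single "/" as in the example above; the function name is made up, and the regex is deliberately crude (it rewrites every root-relative src, including scripts, and does not touch URLs inside CSS):

```python
import re

# Match src="/path" or src='/path', but not protocol-relative src="//host/path".
ROOT_RELATIVE_SRC = re.compile(r"""(src=["'])/(?!/)""")


def absolutize_src(html: str, base: str) -> str:
    """Prefix root-relative src values with base, e.g. http://YOURSITE.weebly.com."""
    base = base.rstrip("/")
    return ROOT_RELATIVE_SRC.sub(lambda m: f"{m.group(1)}{base}/", html)


if __name__ == "__main__":
    tag = '<img src="/uploads/2/6/8/5/26851316/1434298489.png">'
    print(absolutize_src(tag, "http://mjmacoustique.weebly.com"))
```

Keep in mind that this keeps the rehosted site dependent on the weebly copy staying online.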
When I tried exporting a site from weebly, some assets were missing from the zip it produced. This resulted in some images failing to appear because they simply weren't there. I don't know how often this happens (or if it happens only for some sites), but weebly's export feature definitely seems to have bugs.
I worked around this by using wget to recursively fetch the content that weebly was hosting. Then I hand-copied the missing assets (and only the missing assets) from the directory structure saved by wget and merged them into the directory structure from weebly's export zip. This is time-consuming but necessary, since the directory structure fetched by wget includes dynamically generated content (metadata for weebly's editor, assets with decorated names, etc.) that you probably don't want in the content you host elsewhere.
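Part of that merge could be scripted. The sketch below, in Python, lists the files that exist in the wget tree but not in the export tree, and only copies them when asked to. It cannot tell wanted assets from the editor metadata mentioned above, so the dry-run list still needs a manual review; the function name and parameters are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_missing(wget_dir: str, export_dir: str, dry_run: bool = True) -> list[str]:
    """List (and optionally copy) files found under wget_dir but not export_dir."""
    src_root, dst_root = Path(wget_dir), Path(export_dir)
    missing = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if dst.exists():
            continue  # never overwrite what the export already provides
        missing.append(rel.as_posix())
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
    return missing
```

Running it once with the default dry_run=True shows the candidates; a second run with dry_run=False performs the copy.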

Download from URL (PATH) - iPhone

I've got a little question.
I'm working on an app, and for it I have to download an HTML file together with its CSS and images.
And yeah, there's an API for that (ASIHTTPRequest), but I want to publish my app to the App Store and I don't want to use third-party APIs.
And parsing the HTML code is a bit hard :(
It would also work for me if I could download everything under a URL path.
For example:
I have this URL: http://example.org/smthg/.
At this path I have:
-index.html
-logo.png
-style.css
And I want to download all these files AUTOMATICALLY, rather than requesting each file individually.
But I don't think you can find out which files are on the server, right? (Not without brute force.)
I hope you know what I mean :)
You can use a UIWebView to download the content at the location and hold on to the web view. You could also use NSURLConnection to download the content at a URL if you want to save it unformatted and you have the URLs of the resources.
There's nothing wrong with using 3rd party frameworks, as long as they're good quality frameworks and you use them right. ; ) Apple just gives you the starting blocks to make an app, after all, and using open-source code can really speed up your project.
With that said, ASIHTTPRequest is a bit outdated and not well maintained. Instead, I'd recommend AFNetworking, which supports asynchronous downloads, background downloads, and blocks. See https://github.com/AFNetworking/AFNetworking .
Regarding your specific issue of downloading certain files, however, you might try creating one or more plists on the server (if it's yours, that is; otherwise perhaps bundled within the app) that list all of the needed files and their download locations.
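On the server side, such a plist could be generated with a small script. A sketch in Python, assuming the files live in one folder that is served under a known base URL; the key names "files", "path", and "url" are made up, and the app would have to parse whatever layout you choose:

```python
import plistlib
from pathlib import Path
from urllib.parse import quote


def build_manifest(site_dir: str, base_url: str) -> bytes:
    """Return an XML plist listing each file under site_dir with its download URL."""
    root = Path(site_dir)
    base = base_url.rstrip("/")
    files = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            # quote() escapes spaces and similar characters; "/" is kept.
            files.append({"path": rel, "url": f"{base}/{quote(rel)}"})
    return plistlib.dumps({"files": files})
```

The app then downloads this one manifest first and fetches every listed URL, which avoids having to guess which files exist on the server.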
However, the issue you're likely to face quickly is that even if your app has downloaded all the needed files, it still has to understand what to do with them. If it's just HTML content, styles, etc., perhaps you can display it in a UIWebView? However, be sure that your app adds some useful functionality besides just being a web browser... (unless, of course, you're making an enhanced web browser... ;)
Good luck!

Just want to highlight some text when using a browser to view local HTML

A lot of downloadable tutorials come as .chm, .pdf, .html, etc. I downloaded a Java SE tutorial in HTML format. When I view it in Chrome, everything looks good. But I wonder how I could highlight useful information (e.g. text) directly while viewing it in Chrome. The HTML files are local; I know that I could use some software to edit them, e.g. with an HTML tag such as <font color:>.
But I just want to highlight text directly in the browser, like editing it in Word. Are there any suggestions? Does Chrome support this kind of plugin? If you still don't understand what I mean, please refer to "Clip to Evernote", a Chrome plugin that can clip pages and upload them to the Evernote server. When I read them in the Evernote client, I can directly highlight the words that are useful to me.
This is much more of a Super User question, but there are a lot of plugins for highlighting web pages out there. You could try Yawas or Simple Highlighter.
Edit: OK, I think I understand your problem better now. Yawas, Simple Highlighter, and most other highlighters don't highlight on local pages.
I'm not sure there is such a highlighter available for Chrome, then. What I would suggest is to try opening your documentation with Amaya instead of Chrome. It is both the browser and the editor from the W3C, and since it has both functionalities, you will probably be able to do what you want on your local pages.
You can save it to your computer by clicking "Open a new tab containing a list of highlights and notes on just this page". Then you can save just the HTML contents to your computer under any name you like. Don't try to use ALT to save the list of notes, because you will not see the contents you want to save.