Converting HTML to doc(x) and / or PDF [closed] - html

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I have to convert HTML to the doc(x) and PDF formats.
I found Aspose, but this tool can do a lot more than I need, which is why it isn't really cheap.
Are there similar tools that just do this conversion?
I need this in a desktop application where Word / Office is not installed.
*Just for info: I finally bought Aspose.Words. All the other options weren't as good as this tool.

Assuming that these are essentially “documents” and not fancy graphical web pages (i.e. you'd like them to be legible, but aren't deeply concerned with the minutiæ of web layout formatting), you can use LibreOffice to convert them; either manually (open, export as…) or using the "headless" mode, e.g.:
soffice --headless --convert-to pdf --outdir pdfs/ *.html
soffice --headless --convert-to doc --outdir docs/ *.html
Free, cross-platform, but a bit of a hefty install. (I think it's nearing the half-gigabyte mark for the full suite with all the plug-ins installed, but you should only need the Writer component)
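Since the question is about a desktop application, here is a minimal sketch of driving that headless conversion from Python. It assumes `soffice` is on the PATH and uses the double-dash option spelling from the current LibreOffice documentation. The function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format, outdir, soffice="soffice"):
    """Return the argument list for one headless LibreOffice conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def convert(source, target_format, outdir, soffice="soffice"):
    """Run the conversion; raises if soffice is missing or exits non-zero."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found on PATH")
    # Create the output directory up front so the result has a place to land.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_convert_command(source, target_format, outdir, soffice),
        check=True,
    )
```

Calling `convert("page.html", "pdf", "pdfs")` would then correspond to the first command line above, for a single file.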

Maybe http://kitpdf.com will help. I tried it; it's free and really easy to use.

You can use ABCPdf:
http://www.websupergoo.com/products.htm

I can't speak for docx format, but you might look into DocRaptor to convert HTML to PDF format. It definitely handles CSS styling better than comparable programs, and doesn't just give you an image like creating a PDF with Photoshop.

If the webpage is or can be hosted, you can install a Google Chrome extension called Screen Capture, which lets you take a full screen grab of the webpage. You can then paste it into Photoshop and Save As a .pdf (assuming you have Photoshop, that is).


Recommendations of static site generator which accepts Markdown documents? [closed]

Closed 9 years ago.
I'm looking for a static site generator that accepts Markdown documents as its input source.
I used Markdoc, but it looks abandoned, and it doesn't copy static files from the source document folder. So I'm installing DocPad now. Anyway, I want to try other implementations. Can you recommend a good site generator of this kind?
http://staticsitegenerators.net is a crowd-sourced definitive listing of all the static site generators, their github stars, their website, their language, created and updated dates, etc.
+1 for DocPad. I've found Jekyll to be quite crippling with its lack of extensibility (not enough markups supported, difficult to filter documents in content listings based on certain criteria, hard to write extensions, etc.)
You can also take a look at nanoc, which is Ruby based and actively being developed, too.
Cabin is a node.js static site generator powered by Grunt. It currently has three beautiful blogging themes available out of the box. Getting started takes like 45 seconds. Here are the available themes, with links to installing each:
Jekyll is quite mature and actively being developed.
Poole is another one. Conceptually it's something in between plain Markdown to HTML conversion and more sophisticated site generators like Hyde.
Poole uses one global HTML skeleton file to inject the HTML versions of Markdown source pages into. Poole has basic support for generating content by embedding Python code in page source files. This is a dirty merge of content and logic but a pragmatic solution to get things done fast for simple sites. No need to learn a template or preprocessing engine.
Poole may be a good choice if you are familiar with Markdown and Python and if you want to build a rather simple site with only a spot of generated content.
Disclaimer: I'm the developer of Poole.
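To illustrate the "one global HTML skeleton" idea in general terms, here is a rough sketch. It is not Poole's actual code: the skeleton, the placeholder names and `render_page` are invented for this example, and the page content is assumed to have already been converted from Markdown to HTML.

```python
import html
from string import Template

# One shared skeleton; every page is injected into the same frame.
SKELETON = Template("""<!DOCTYPE html>
<html>
<head><title>$title</title></head>
<body>
$content
</body>
</html>
""")


def render_page(title, content_html):
    """Insert already-converted page HTML into the global skeleton."""
    # The title is plain text, so escape it; the content is trusted HTML.
    return SKELETON.substitute(title=html.escape(title), content=content_html)
```

A generator built this way only has to loop over the source pages, convert each one and call `render_page`, which is why no separate template engine is needed for simple sites.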
I recently moved my blog from Google Sites to the Node-based Wintersmith. I am fairly impressed with the flexibility and Markdown support it provides. There are also several templates and open source reference websites available in their git repository.
If you are on a Mac, I recommend Hammer (http://hammerformac.com/). It supports Markdown and also SASS (with Bourbon), CoffeeScript and HAML.

Is there a library for converting Flash / Flex AS3 TextLayoutFormat data to HTML and CSS? [closed]

Closed 9 years ago.
I have the job of recreating a Flex app in HTML and CSS. The existing app makes considerable use of TextFlow to lay out content. For several reasons I need to be quite accurate (within a few pixels) with positioning.
The current application is loading data which looks like this:
<p paragraphstartindent="0"
textalign="center"><span alignmentbaseline="useDominantBaseline"
backgroundalpha="1"
backgroundcolor="transparent"
baselineshift="0"
breakopportunity="auto"
cffhinting="horizontalStem"
color="0x0"
digitcase="default"
digitwidth="default"
dominantbaseline="auto"
fontfamily="ArialCFF"
fontlookup="embeddedCFF"
fontsize="22"
fontstyle="normal"
fontweight="bold"
kerning="auto"
ligaturelevel="common"
lineheight="120%"
linethrough="false"
locale="en"
renderingmode="cff"
textalpha="1"
textdecoration="none"
textrotation="auto"
trackingleft="0"
trackingright="0"
typographiccase="default">Here is some content which needs to be accurately positioned</span></p>
Ideally I'm looking for a library I can use to translate these many attributes into "proper" html and css. The current technology stack is PHP at the back end and javascript at the front end, but there would be little problem in using any other language to do the translation.
Failing that I guess I'll try and write my own, using the api reference as a guide.
I don't think there's a lib available for that, but from a quick look at the docs, it shouldn't be too hard to translate over. Most of the options you can ignore as they're impossible to do in CSS (without going into CSS3 - I'm assuming you want maximum compatibility here) and the rest are pretty basic (colour, font, padding, line-height...)
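As a starting point for writing your own, here is a minimal sketch of translating the "basic" attributes into inline CSS. It is written in Python since the question says any language would do. The mapping table is an assumption drawn from the sample data, not a complete or authoritative reading of the TextLayoutFormat reference. In particular it assumes `fontsize` is in pixels and skips embedded font names such as `ArialCFF`.

```python
from html.parser import HTMLParser


def tlf_color_to_css(value):
    """Convert a colour such as '0x0' or '0xFF0000' to '#rrggbb'."""
    return "#{:06x}".format(int(value, 16))


# Attribute name (lowercased) -> function returning (css_property, css_value).
# Only a handful of attributes are covered; everything else is ignored.
CONVERTERS = {
    "color": lambda v: ("color", tlf_color_to_css(v)),
    "fontsize": lambda v: ("font-size", f"{v}px"),  # assumed to be pixels
    "fontstyle": lambda v: ("font-style", v),
    "fontweight": lambda v: ("font-weight", v),
    "lineheight": lambda v: ("line-height", v),
    "textalign": lambda v: ("text-align", v),
    "textdecoration": lambda v: ("text-decoration", v),
}


def attributes_to_css(attrs):
    """Turn a list of (name, value) pairs into a CSS declaration string."""
    declarations = []
    for name, value in attrs:
        converter = CONVERTERS.get(name.lower())
        if converter is not None and value is not None:
            prop, css_value = converter(value)
            declarations.append(f"{prop}: {css_value}")
    return "; ".join(declarations)


class StyleCollector(HTMLParser):
    """Collect a (tag, css) pair for every <p> and <span> start tag."""

    def __init__(self):
        super().__init__()
        self.styles = []

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "span"):
            self.styles.append((tag, attributes_to_css(attrs)))
```

Feeding the sample markup from the question into `StyleCollector` gives one CSS string per element, which can then be written into a `style` attribute or a generated class. Pixel-level agreement with the Flex rendering is a separate problem that this does not address.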
Maybe Wallaby, the Adobe App to convert FLA files to HTML5/CSS can be helpful if you manage to make it work with your Flex Files... http://labs.adobe.com/technologies/wallaby/
This, of course, would just be a starting point :) but hope it helps.
Unfortunately, you will never get pixel accuracy in HTML text, by design. Font rendering strategies differ between browsers, and even between browser modes (e.g. IE9, Safari for Windows), so the resulting layouts can differ.
You may be able to export your content to HTML with the TextConverter class.
I would go with quick and easy since you only need formatted DOM elements (HTML tags). This is [part of Flash Player 9] somewhat reliable - you might want to give it a try...
source : flashx.textLayout.elements.TextFlow
format : String (e.g. TextConverter.TEXT_FIELD_HTML_FORMAT)
conversionType : String (e.g. ConversionType.STRING_TYPE)
var html:String = flashx.textLayout.conversion.TextConverter.export(source, format, conversionType) as String;

Convert pdf, doc, ppt to html5 [closed]

Closed 9 years ago.
I've googled (without any luck) for open source software that can convert doc, ppt, and pdf to HTML5. (Exactly what Scribd does) Are there open source equivalents to the type of conversion Scribd does?
If anyone knows of a paid service, that would also work. Scribd has an API, but that's for use with the flash viewer. Also, I would like to host my own content as I need further control over converted html document.
You're unlikely to find a single offering that does all this, especially in the open source world. It's more likely that you'll end up relying on a mishmash of things, and may even need to chain some converters in order to get to HTML. (Eg PDF -> ps -> HTML)
OpenOffice supports conversion to HTML, and can be called from the command line.
http://pdftohtml.sourceforge.net/ looks reasonably good at converting pdf to html.
For Doc that is Word ML or OpenXML format it's conceivable that you could use XSLT transforms since both input and output formats are XML. I've seen some stylesheets floating around the net that do this, but YMMV.
Incidentally, why is there a specific requirement for open source? MS Powerpoint already supports save-as-HTML for example.
Open Office will convert pdf to html but you'll take a hit to design quality.
I suggest either Crocodoc as a paid service (it comes in different flavours for different platforms such as Python, Ruby, Java and PHP, and developers are allowed to work with its APIs) or waiting for an official Adobe tool (it's in the works).
For PDF to HTML conversion, pdf2htmlEX seems like a pretty good tool (looking at all the examples/samples):
https://github.com/coolwanglu/pdf2htmlEX
For pdf there is an open source project started by mozilla and it's very good: https://github.com/mozilla/pdf.js/
You can see a hello world example : https://github.com/mozilla/pdf.js/tree/master/examples/helloworld
For the other document types, I think LibreOffice said they are planning to build something in HTML5, but so far nothing has been done.
http://wvware.sourceforge.net/
wvHtml: convert your Word document into HTML 4.0.
Possibly:
http://www.abisource.com/
but in this case it looks like you have to "open doc" > "export html" manually; maybe plugins help. I'm not sure what you mean by "source software that can convert".
Or this:
http://www.zope.org/Members/sf/NuxDocument
Also, pdftohtml will give you an HTML page as output, but you will have to work on its graphical interface, since it doesn't seem to be very interactive.
I know the question is a bit old; however, I have found a new open source tool called FlexPaper: http://flexpaper.devaldi.com/

Dynamic HTML to PDF [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
I need to be able to convert dynamic HTML (html that is rendered on page load by javascript) to a PDF. I know there are plenty of HTML to PDF converters but none of the ones I have found thus far cope with dynamic HTML.
The given tool should be able to successfully convert the following page - http://www.simile-widgets.org/timeline/
Cheers
Anthony
UPDATE:
I don't need the JavaScript functionality here, i.e. I don't need to be able to interact with the screen. I just want the final rendering of the screen to be captured in the PDF, like taking a photo after the page is loaded. In the example I provided, the JavaScript is only rendering divs to the screen, so it's nothing that a converter shouldn't be able to handle as long as it "lets" the "page" render first.
There is no way it can be done. The interfaces available for scripts in PDF are extremely limited compared to the full DOM and BOM access you enjoy in a web browser. Such interaction as you can achieve in PDF is not readily translatable from how it works in a browser and would almost certainly need hand authoring.
Your example page has many effects that PDF, as an essentially static document layout format, simply cannot reproduce at all.
Edit:
I just want the final rendering of the screen to be captured in the PDF
Ah, OK, that's a far easier and more common problem then.
In that case you'll have to use and automate a real web browser (like Firefox), or a toolkit that provides all the logic of a web browser (like WebKit), then either:
export to PDF, either using built-in tools like ‘Print to file’ in Firefox (with background images/colours turned on) or one of the PDF export add-ons, or
take an image snapshot of the browser (and include the image in a PDF if you have to)
See these questions for some discussion of browser snapshotting.
The fact that it uses any JavaScript at all means a lot of converters won't work. The JavaScript may be simple, but you still need an interpreter to handle it.
I haven't used it myself, but you might try wkhtmltopdf. It uses the WebKit rendering engine, and I believe it includes full JavaScript support. You would need to be able to install the software and run the executable, but otherwise it should be fairly straightforward.
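For the "let the page render first" requirement, wkhtmltopdf has a `--javascript-delay` option that waits a given number of milliseconds before printing. A minimal sketch of calling it from Python follows. It assumes the `wkhtmltopdf` executable is on the PATH. The helper names and the 2000 ms default are arbitrary choices for illustration.

```python
import subprocess


def build_wkhtmltopdf_command(url, output_pdf, js_delay_ms=2000,
                              binary="wkhtmltopdf"):
    """Return the argument list that renders `url` to `output_pdf`."""
    return [
        binary,
        # Give page scripts time to finish drawing before the snapshot.
        "--javascript-delay", str(js_delay_ms),
        url,
        output_pdf,
    ]


def render_to_pdf(url, output_pdf, js_delay_ms=2000):
    """Run wkhtmltopdf; raises CalledProcessError on a non-zero exit."""
    subprocess.run(
        build_wkhtmltopdf_command(url, output_pdf, js_delay_ms),
        check=True,
    )
```

Whether a fixed delay is long enough depends on the page, so the value may need tuning for something like the timeline example.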
You could use a javascript URI to alert the current DOM. eg:
javascript:alert("<html>" + document.documentElement.innerHTML + "</html>")
Copy the HTML and save to a file.
Then run it through the HTML2PDF converter.
dynamic-html-pdf
This is the best library for Node.js to convert dynamic HTML to PDF.
https://www.npmjs.com/package/dynamic-html-pdf
You can probably use PhantomJS or headless chrome.
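For the headless Chrome route, a minimal sketch using Chrome's `--headless` and `--print-to-pdf` switches could look like this. The binary name is an assumption (`google-chrome` here; it may be `chromium`, `chrome.exe` or a full path on your system), and the helper names are made up.

```python
import subprocess


def build_chrome_pdf_command(url, output_pdf, binary="google-chrome"):
    """Return the argument list that prints `url` to `output_pdf`."""
    return [
        binary,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output_pdf}",
        url,
    ]


def print_to_pdf(url, output_pdf, binary="google-chrome"):
    """Run headless Chrome; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_chrome_pdf_command(url, output_pdf, binary), check=True)
```

Because this runs a real browser engine, the page's JavaScript executes before printing, which is the property the question is after.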
Try xhtml2pdf. Here's the project page at python.org.

Is there a good website with lessons to learn HTML? [closed]

Closed 9 years ago.
I am looking for recommendations for a starting website to learn how to write HTML code
This question seems a bit weird... what do you mean by "sandbox"?
Usually you simply practice writing HTML by using a text editor and opening the local file from the browser.
Start here at w3schools.com. They provide a nifty little sandbox with sample code for all your web design element questions.
Notepad + any browser - this works well for me. Just save your file as .htm
Or if you want, get Firefox or Opera, go to any site (say, stackoverflow.com or w3schools.com), view the source, edit away and then apply the changes. Don't worry, the changes only affect a single tab and don't change anything on the web.
A sandbox for HTML? You must be kidding. There is no chance of getting hurt even if your HTML goes wrong, so you don't need a sandbox.
Use any decent editor which gives a two-tab view for Source-code and Quick-view, and you are done. You can use MS Frontpage or EditPlus, both offer these features. You don't need to save to see the effect.
Please don't clog the bandwidth for just testing and debugging HTML. It ain't worth it.
Some things don't work with Javascript when served from file:// due to security protocols, and sometimes it can be too much of a pain trying to get a webhost up and running for experimenting with stuff.
I have found http://www.webdevout.net/test to be a convenient playground tool. It has the added benefit that when you mangle something and want help working out what you did wrong, you can post the link to somebody and they can see what you've done, without you needing to worry about security, hosting, or firewalls.
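If the file:// restrictions are the only obstacle, another option (not mentioned in the answer above) is a throwaway local server from Python's standard library, so no web host is needed. A minimal sketch, assuming Python 3.7 or later; the function name and default port are arbitrary:

```python
import functools
import http.server
import threading


def start_local_server(directory, port=8000):
    """Serve `directory` at http://127.0.0.1:<port>/ in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Bind to loopback only, so the pages are not exposed to the network.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

After `server = start_local_server("my_pages")`, the files are reachable under http://127.0.0.1:8000/ and `server.shutdown()` stops the server again. From a terminal, running `python -m http.server` in the folder does much the same thing.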
I'd say check out these video tutorials from net tuts. It starts off with the very basics and then moves on to more in depth stuff. The tutorials are organized as a 30-day course, where they'll mail you a link to a video tutorial each day. The idea being you'll have learnt html/css within 30 days. But you really don't need to sign up for the mailing service, just take it at your own pace.
http://learncss.tutsplus.com/