I am completely new to this. I want to convert PDF files into HTML and then create an interactive online course application that students can download over the Internet. Is there a way to do this, or how can I export the HTML content to Moodle?
I am really stuck with this.
Many PDF files are effectively page images rather than editable documents, which is why you cannot always convert them directly to HTML or import them into Moodle. If the text in your PDF is selectable, you might just copy it and paste it into Word or directly into Moodle.
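Once the text has been copied out of the PDF, turning it into simple HTML that can be pasted into a course page is mostly a matter of escaping it and wrapping paragraphs. A minimal sketch, assuming paragraphs are separated by blank lines; the function name `text_to_html` and the default title are made up for illustration:

```python
import html


def text_to_html(text: str, title: str = "Lesson") -> str:
    """Wrap plain text in a minimal HTML page, one <p> per paragraph.

    Assumption: paragraphs in the copied text are separated by blank lines.
    """
    blocks = [b.strip() for b in text.replace("\r\n", "\n").split("\n\n")]
    paragraphs = [
        # Collapse line breaks inside a paragraph and escape <, > and &.
        "<p>" + html.escape(" ".join(b.split())) + "</p>"
        for b in blocks
        if b
    ]
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n<body>\n"
        + "\n".join(paragraphs)
        + "\n</body></html>\n"
    )
```

This only handles running text; headings, tables, and images from the PDF would still need to be rebuilt by hand.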
Related
I have a MediaWiki website that has about 1000 additional files that are in web directories that are auto-indexed. I'd like those pages to appear in the MediaWiki index. I've come up with two approaches:
Write a MediaWiki plug-in that creates a page for each directory, with a bulleted list entry for each item and a link that downloads the object.
Write a Python program that uses the MediaWiki API to create a MediaWiki page for each item, with full metadata. I can then extract the text and put it on the page as MediaWiki preformatted text.
Some of these documents are quite long, however, and so I'm thinking that another approach would be to extract the text from the PDFs and put it into the MediaWiki index. For the multi-page PDFs, it might even make sense to upload (automatically?) a thumbnail of the PDF first page, or even all of the pages.
So what's the appropriate way to reference PDF files with full text on a MediaWiki website?
Use PdfHandler to expose PDF file metadata to search, and upload those documents as files (using e.g. Pywikibot or importImages.php).
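For comparison, the first approach in the question (one page per directory with a bulleted list of download links) comes down to generating wikitext like the following. This is only a sketch of the text generation; the function name `directory_wikitext` and the example URL are hypothetical, and actually saving the page (for example with Pywikibot) is left out:

```python
from urllib.parse import quote


def directory_wikitext(base_url: str, filenames: list[str]) -> str:
    """Build a wikitext bulleted list with one external link per file."""
    lines = []
    for name in sorted(filenames):
        # Percent-encode the file name so spaces do not end the link target.
        url = base_url.rstrip("/") + "/" + quote(name)
        # Wikitext external link syntax: [URL label]
        lines.append(f"* [{url} {name}]")
    return "\n".join(lines)


print(directory_wikitext("https://example.org/files/", ["b.pdf", "annual report.pdf"]))
```

File names containing `]` would need extra escaping in the label, which this sketch does not attempt.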
I'm building an online app where you can create and edit documents in a WYSIWYG HTML editor (I'm using NicEdit) and save them on the server.
Now, how can I save the contents of the editor?
I could save it as an HTML document, but then images added to the editor by URL (from any other site) are not saved permanently, because the editor only uses an <img> tag to reference them. I also haven't found a good way to convert an image to a data URL so that it can be embedded permanently in the <img> tag.
Please help!
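For the data URL part, the conversion itself is just base64-encoding the image bytes and prefixing the MIME type. A minimal sketch, assuming the server has already downloaded the image bytes (the download step is omitted) and that the file name carries a recognizable extension; `to_data_url` is a made-up helper name:

```python
import base64
import mimetypes


def to_data_url(data: bytes, filename: str) -> str:
    """Encode raw image bytes as a data: URL usable in an <img src="...">."""
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Replacing each remote `src` with the returned string makes the saved HTML self-contained, at the cost of a document roughly a third larger than the original image bytes.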
Basically a simple question:
In PHP you have libraries such as mPDF and TCPDF that convert your HTML/CSS as-is into a PDF file.
I now have a JSP page from which I open a popup JSP containing a kind of organizational chart built from divs with HTML/CSS. Can I take this whole popup and convert it into a PDF file? Chrome's "Save as PDF" option works and creates the PDF successfully, but I want the website to generate the PDF itself, without a browser plugin. Is that possible? Does Java/JSP have an option to convert pure HTML/CSS to PDF?
You could use wkhtmltopdf to convert from HTML to PDF: https://code.google.com/p/wkhtmltopdf/
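wkhtmltopdf is a command-line tool, so the server launches it as an external process with an input and an output path. The sketch below shows that call in Python, assuming the `wkhtmltopdf` binary is installed and on the PATH; the helper names are made up. From Java/JSP the same command line could be started with `ProcessBuilder`.

```python
import shutil
import subprocess


def build_command(html_path: str, pdf_path: str) -> list[str]:
    """Basic wkhtmltopdf invocation: wkhtmltopdf <input> <output>."""
    return ["wkhtmltopdf", html_path, pdf_path]


def html_to_pdf(html_path: str, pdf_path: str) -> None:
    """Render an HTML file (or URL) to PDF by running wkhtmltopdf."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf is not installed or not on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(html_path, pdf_path), check=True)
```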
I want to use IcePDF or PDFBox to extract content from a PDF, but I don't know how to go on from there and generate HTML web pages from the extracted text and images.
You can convert PDF to HTML with PDFBox. Try this link.
By adding -html as a parameter when you extract text, you get the PDF's text as HTML. It will not contain images, graphics, or other details; it is only the text extracted from the PDF, in HTML format.
If you want to reproduce the exact look and feel of the PDF, there is no single-step method in PDFBox. To my knowledge, no library can produce an exact HTML rendering of a PDF. With PDFBox, however, you can extract the images, the text, and its details, and then write your own logic to produce the HTML from them. We did a project to convert PDF to HTML for azzist.com using PDFBox; in azzist we convert résumés to HTML format. (Some font issues remain.)
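The "own logic" step could, for example, place each extracted text fragment at its page coordinates with absolute positioning. The sketch below is not the azzist implementation and does not call PDFBox; it assumes the fragments have already been extracted, and the `TextFragment` type and its fields are hypothetical. It also assumes `y` is measured from the top of the page, whereas PDF coordinates normally start at the bottom-left, so a real converter would have to flip the axis first.

```python
import html
from dataclasses import dataclass


@dataclass
class TextFragment:
    """Hypothetical record for one piece of extracted text."""
    text: str
    x: float          # left offset in points
    y: float          # top offset in points (assumed already flipped)
    font_size: float  # in points


def fragments_to_html(fragments, page_width: float, page_height: float) -> str:
    """Emit one absolutely positioned <span> per fragment inside a page <div>."""
    spans = []
    for f in fragments:
        style = (
            f"position:absolute;left:{f.x:g}pt;top:{f.y:g}pt;"
            f"font-size:{f.font_size:g}pt"
        )
        spans.append(f'<span style="{style}">{html.escape(f.text)}</span>')
    return (
        f'<div style="position:relative;width:{page_width:g}pt;'
        f'height:{page_height:g}pt">\n' + "\n".join(spans) + "\n</div>"
    )
```

Fonts are the hard part of this approach: unless the HTML uses the same font and metrics as the PDF, fragments drift or overlap.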
Scribd, Google, Dropbox, Zoho, and others have accomplished this conversion in a better way. You can look at any of these sites to see their results. (You will not get their logic; you have to work that out yourself.)
We are developing a CRM application, and for it we need to upload *.doc and *.docx files and display their contents.
We have successfully uploaded the *.doc and *.docx files to the application using FileReference and FileReferenceList. Could you suggest a way to read the contents of the uploaded *.doc and *.docx files and display them in a Flex text area?
Thanks in advance.
This link shows one way to open a doc or docx file through Flash, but it will not output the text into a Flex text area. To do that, you will have to decode the Word file yourself, which is not an easy task. I know of an AS3 library for opening Excel documents, which can give you an idea of what you'll have to do, but I'm not aware of an existing library that will read Word files for you.
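To give an idea of what decoding involves for the newer format: a .docx file is a ZIP archive whose main text lives in `word/document.xml`, with paragraphs in `w:p` elements and text runs in `w:t` elements. The sketch below shows those steps in Python rather than AS3, purely as an illustration; `docx_paragraphs` is a made-up name, and the older binary .doc format is not covered and is considerably harder.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source) -> list[str]:
    """Return the plain text of each paragraph in a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join all w:t pieces.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Formatting, tables, and images are ignored here; only the running text is recovered, which is what a plain text area can show anyway.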