Currently I am working on a project for a client that compares the difference between two XML files, generates an XML that lists the differences (i.e. if a part in an inventory was <Added>, <Deleted>, or <Modified>) and displays a report in HTML.
I have three transforms that basically transform large vendor-specific XML files to simple generic XML files (schema defined). These generic XML files are then transformed into one generic XML file that shows the differences and then that is transformed to a report.html for display for the user.
Presently for testing, I invoke a .bat file to run all three transforms (using Saxon8.jar). My question is, is it possible to put these transforms on a server and create a HTML page with a one-click action that will let the user upload the vendor-specific XML files, transform them, and display the generated HTML file to the user?
You haven't specified whether you'll be using php, java or ASP.NET, however, the functionality you're looking for is possible in all three cases. Your backend web app should have the necesssary mechanism to accept the file uploaded by the user, save it in some work folder, run the necessary transformation using your chosen language, Jave, C#, php etc. and then write back the HTML.
Is it possible? Yes.
To do it you'd typically use some server-side technology (php, ruby, java) to perform the transforms.
But browser-side XSLT is possible, too.
Apache Cocoon is a powerful XML processing engine.
If you're just doing this one job, then coding a Java servlet to do it is not too difficult. If you're doing lots of similar things, a framework like Cocoon or Orbeon will save you effort in the long run.
Related
This question has been troubling me for the past week. Below, I will list my issue, and the research I have put into it.
The scenario: I was given a .csv file with 5000 rows and three columns. The three columns are defined as:
Site ID|Site Name|Site URL
My task: To create an HTML interface for the designers of the company to rate each site on a scale of 1-5.
My plan of action: I am a new hire. I am getting accustomed to the language I was hired for, which was Objective-C.
My algorithm for the project was to:
Parse the .csv
Remove the "Site Name" variable
Create a new .csv that contains the below variables: Site ID|Site URL|Rating|Image
Display the new .csv (with all aforementioned items) as an HTML page where there are toggles for "Ratings", which when pressed, will log the rating into the .csv which it was imported (or loaded) from.
The "Image" section I will be using a piece of software by the name of Paparazzi (on the Mac OS X operating system) which takes a fully formatted screenshot of the main page and saves it as a PNG file. I plan on using the file extension URL (which is stored locally) and load it into the "Image" column, thus when the designer clicks on the image, he is able to load the image that is stored locally.
My issue: As Objective-C is not entirely a scripting language, I am confused with some of the libraries I may need and/or methods I can implement this. I have the algorithm, but I am wholy unsure with the implementation.
My questions: If you have done a project similar to this before with Objective-C, what tips can you provide for me? How does one load the .csv as a HTML interface where upon edit, it will save this edit into the .csv? Will I need any servers for this, or is everything executable from just a machine? How do you grab an image (stored locally), extract its file extension, and load it onto the .csv?
The most important question: Is this achievable through Objective-C? My reasoning behind it is, I want to advance my knowledge of OC through a task like this. Yes, using Python is easier, but is it possible to do this with Objective-C?
Thank you.
It certainly is achievable, but I doubt you'd really want to go this way. If I understand it correctly, you want to serve the HTML page to others via web browser - that would mean either writing a (simple) http daemon, that would run on the server or writing a CGI script that would communicate with a standard http daemon. Python/PHP/Ruby do this for you readily, so there is much less room for possible errors.
As for
As Objective-C is not entirely a scripting language
I would perhaps rephrase it as
As Objective-C is entirely not a scripting language
I'm working on automating our company invoicing system. Currently all data is stored in our local MySQL database and someone manually updates an excel spreadsheet and then merges this data into a MS Word template. The goal is to automate this process so that the invoice can be generated from our intranet website as a PDF.
My original plan was to create a template in HTML/CSS and use wkhtmltopdf to generate the PDF but I ran into problems with getting a repeatable header and footer on each page. thead and tfoot aren't supported by Webkit and the fix suggested in this other question does not seem to work either.
So I then stumbled on using XML and XSL-FO, the latter I know nothing about. Is this the best path to take? Are there any libraries or utilities out there that will make converting my HTML+CSS into XML+XSL-FO easier? Are there any other alternatives I'm overlooking?
EDIT
Currently the server is CentOS Linux with a MySQL database. All other code is currently in PHP currently but that may change as the whole system is being revamped. Linux and MySQL will almost certainly remain, though.
For your requirement, XSL-FO might just do the trick. It is much cleaner to produce the pdf's directly from the data, then going the cumbersome html path, unless you need to display the html as well, then you might consider converting from html to pdf, but it will always be messy.
You can get xml results from mysql quite easily (mysql --xml) and then you write one (or several) xsl-fo stylesheet for the data. then, you cannot only produce pdfs, but also postscript files or rtf's with some processors.
XSL-FO has its limitations tho, but for your situation, it should suffice.
I admit, the learning curve can be steep, and maintaining xslt-stylesheets can get very tiring, but as you start knowing more about it, you end up writing less code.
another possibility is to do the whole thing in e.g. java or c# - send select statements and loop the results and iteratively build the pdf using a library like iText.
You could try JODReports or Docmosis as less-code intensive options. You supply Word or OpenOffice Writer documents to act as templates and use these engines to manipulate/populate the templates then spit out the documents in the format(s) you require. This may mean your existing Word-templates can be used directly which should save you some effort/time.
iText is another library that will let you build and pump out PDFs from code. It's pretty good.
If you cloud use ASP.NET for web you can use free ReportViewer library and designer for automated of publishing PDF-s.
Here is some references:
http://gotreportviewer.com
http://weblogs.asp.net/srkirkland/archive/2007/10/29/exporting-a-sql-server-reporting-services-2005-report-directly-to-pdf-or-excel.aspx
If you're OK using .NET and C#, you could use DotPdf from Atalasoft (obligatory disclaimer: I work for Atalasoft and wrote most of DotPdf). The Generating namespace is geared for exactly what you're trying to do: automate report generation. From the very basics, you could just create docs directly with the toolkit or you can create template documents that have unpopulated text fields that you can reload and fill later (see here and here for examples).
I have a logistic web where there is a formulary to submit news package. I'm currently doing test runings on that website and I want to do some for that package formulary... The problem is that I want to do a lot of tests with diferent info in the formularys, and do that editing an xml is a bit tiresome.
I am looking for a tool that can generate a formulary about an html website (or with the .class object) with all the fields there are and where I can fill it and generate an automatic xml to do the tests.
My boss told me that it probably could do it with "wsdl" but i dont have any idea about that. Can you help me with any solution?
I'm working with .NET c#, and for the tests: gladio, watin and NUnit.
I'm not sure what you mean by formulary to submit news package, but if you are looking for sample data to do tests with, here are some sample data generators: http://www.webresourcesdepot.com/test-sample-data-generators/
You can use the sample data to create and save xml files for your tests.
Also, wsdl is short for Web Services Description Language, which is on no use for your purpose.
Note: I realize this question has already been asked (with a ruby slant) here: Creating on-demand, print-quality PDFs (preferably in Ruby if feasible). BUT there was no decent answer IMHO.
So as you may have guessed, I am looking to find the best approach to producing HIGH QUALITY, print ready PDF documents programmatically. Our requirements need us to be able to have design documents that define place holders for dynamic content like images and text i.e. some kind of template mechanism.
The suggestion has been to use Adobe's InDesign server, but this seems like an expensive solution not to mention a little overkill for our need.
Are there any alternative, cheaper and more fitting solutions out there? The language of the solution doesn't really matter, just as long as it can be executes on a Windows box.
My suggestion would be to look at XSL-FO or thereabouts...
You create an XML doc that describes what you want and there are various libraries and toolkits (I've used XEP from RenderX) that will convert said XML into PDF.
In real terms what we did was take a large lump of data in XML format, use XSLT - templates in effect - to convert the data to formating objects which XEP renders up into something (a 500 page hotel directory with auto-generated TOC and Index) that has been consumed quite happily by at least three different commercial printers. We did some other smaller documents too from time to time.
Downside with this is that its not even remotely a WYSIWYG solution - you're effectively compiling "source code" to get PDF out the back. Upside is that the base technologies are reasonably generic even if the specific toolkits may be a bit less so.
You can convert XML templates to PDFs with Prince.
Prince is a computer program that
converts XML and HTML into PDF
documents. Prince can read many XML
formats, including XHTML and SVG.
Prince formats documents according to
style sheets written in CSS.
I have and also know many people that have had much success with ReportLab an open source Python PDF library (http://www.reportlab.org/rl_toolkit.html).
Its extremely easy to use and very quick to get started. So worth trying out.
I don't know why no one has suggested using LaTeX for this. It's an extremely popular open format for document design and not hard to set up a template that you can fill in text or image content. While the reference implementation of LaTeX runs as a standalone program, if that sounds like too many moving parts for you there are wrapper libraries for Python and other languages you can call via an API.
Java language and JasperReports
Java: iText
C#: iTextSharp
depends on what you want to publish, but take a look at Pentaho reporting
http://reporting.pentaho.org/
rinohtype is an open-source document processor that is capable of producing high-quality print-ready PDF documents. You can use one of the built-in document templates (book, article) or define your own template. The look of document elements can be configured by means of CSS-like style sheets. The contents of your document can be parsed from reStructuredText or CommonMark files, or you can build the document tree programmatically.
Full disclosure: I am the author of rinohtype.
Is it possible to reuse HTML tags across multiple files, headers and footers for example? Placing them in separate files adds an extra HTTP request, that I'd like to avoid.
I don't want to replicate minor changes in headers and footers across every html file every time a change request comes along.
HTML is not a programming language - it's a markup language. You don't do object-oriented HTML because it isn't object based. This is the whole purpose of a server-side language, so you can make include files and use them in your server-side application.
If you have Apache however, you can use server-side includes which don't require a programming language such as PHP, but it's less flexible:
<!--#include virtual="/footer.html" -->
First, HTML isn't even a programming language, so it's impossible to have "Object-oriented" HTML.
Placing them in separate files adds an
extra HTTP request, that I'd like to
avoid.
If this is the reason for your "without server side code" requirement, then you are mistaken - the client does not fetch the templates that make up a page separately; the server side code will return a single HTML page to the client.
If, on the other hand, you don't have the option to run any server-side code at all and have to make do with static HTML pages, then there's only two options I can think of: iframes (which do result in separate HTTP requests, of course), or some sort of tool that basically runs the equivalent of server-side code to embed your reused templates everywhere and spits out the result to be uploaded to the server. You can have this effect by running a PHP/Apache-with-SSI/JSP/Whatever server on your development machine and using wget to make a static snapshot of the pages.
What I want to do is this. The files can be scattered during development. But I when I'm ready to release, a toolkit should compile the included files into a single html file.
You can use a template language/engine, such as jinja2.
You can layout files in a certain hierarchy, and have templates inherit from other templates, and include other templates, and define reusable macros (closest thing to what you referred to as "reusable tags").
What I want to do is this. The files can be scattered during development. But I when I'm ready to release, a toolkit should compile the included files into a single html file
I know this is late, but CodeKit's .kit language lets you do exactly what you were saying.
http://incident57.com/codekit/help.php
I think the language you've chosen in your question (object oriented HTML) is actually masking the real issue you have here...
What I want to do is this. The files can be scattered during development. But I when I'm ready to release, a toolkit should compile the included files into a single html file.
This sounds like a job for a preprocessor, I don't believe it has anything to do with your webserver or server side technology, as this is a step which would happen before deployment.
There's a number of text pre-processors available eg M4 - hell you could even use the C compiler pre-processor if you wanted. A quick google reveals that there are specialised pre-processors for HTML as well....
Automatic file inclusion, automatic escaping, and whatnot that can be done with automatically inserted headers and footers, chosen based on path patterns.
Seems to fit the bill?
Sure . But these would have to be separate ajax calls form the client . There are lot of javascript mvc frameworks like that do that .
If you want to have include files during development, then compile them into free-standing HTML files, you could do that by spidering your development server with wget: whatever server-side technology you use will combine the files and return the HTML, which wget will saves as one file.
As everithing is object over the technology but not directly, indirectly interacting with the object that are created at different level as per security implementation.
You can do this.
I just released a mature framework called Hypertag that is, in fact, Object Oriented HTML. It is entirely client-side, in continuous development, and allows for very interesting, yet HTML-compatible, advanced solutions for logic and layout.
See http://hypertag.io for more.