Generate a form from an HTML page to produce XML - html

I have a logistics website with a form for submitting new packages. I'm currently running tests against that website and I want to write some for the package form. The problem is that I want to run many tests with different data in the form, and doing that by editing XML by hand is tiresome.
I am looking for a tool that can generate a form from an HTML page (or from the .class object) with all of its fields, which I can fill in to generate the XML for the tests automatically.
My boss told me that it could probably be done with "wsdl", but I don't know anything about that. Can you suggest a solution?
I'm working with .NET C#, and for the tests: Gallio, WatiN and NUnit.

I'm not sure what you mean by the package submission form, but if you are looking for sample data to run tests with, here are some sample data generators: http://www.webresourcesdepot.com/test-sample-data-generators/
You can use the sample data to create and save xml files for your tests.
Also, WSDL is short for Web Services Description Language, which is of no use for your purpose.
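As a rough illustration of turning sample data into XML test files, here is a minimal sketch. It is written in Python only to keep it short; the same idea carries over to C#. The field names (`sender`, `weight_kg`, `destination`), the `<package>` root element, and the output file names are all assumptions made up for the example, not taken from the real form.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical form fields and the values to try for each one.
FIELD_VALUES = {
    "sender": ["Alice", ""],
    "weight_kg": ["0.5", "30"],
    "destination": ["Madrid"],
}

def build_package_xml(fields):
    """Return one <package> document with a child element per form field."""
    root = ET.Element("package")
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

def all_combinations(field_values):
    """Yield a dict for every combination of the listed values."""
    names = list(field_values)
    for combo in itertools.product(*(field_values[n] for n in names)):
        yield dict(zip(names, combo))

def write_test_files(directory="."):
    """Write one XML file per combination (package_001.xml, package_002.xml, ...)."""
    for i, fields in enumerate(all_combinations(FIELD_VALUES), start=1):
        path = f"{directory}/package_{i:03d}.xml"
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(build_package_xml(fields))
```

Calling `write_test_files()` produces one file per combination of values, so adding a value to any field in `FIELD_VALUES` multiplies the number of test inputs without any hand editing.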

Related

Activating HTML with Haskell

I have a large pile of lecture notes in raw HTML format. I would like to add interactive content to these notes, in particular incorporating online exercises. I have some experience implementing online exercises as cgi-bin executables compiled from Haskell code running on the server, interacting with a student record file and sending suitable HTML back to the browser, using Text.Xhtml to generate the content. Now I plan to integrate the notes and the exercises.
The trouble is that I don't want to spend ages manually transforming my raw HTML into Haskell code to generate exactly the raw HTML I started with. Instead, I'd like to put my Haskell code and my HTML in the same source file, with placeholders in the latter for content generated by the former. A suitable tool should then transform this file into Haskell source code for (e.g.) a cgi-bin executable which generates the corresponding page.
Before I go hacking up such a piece of kit, I thought I'd ask if there's better technology out there already. The fixed points are the large legacy lump of HTML, the need to implement the assessment of the exercises in Haskell, and the need to interact with student records on the server. The handicap is that I need to use the departmental web server and I can't reconfigure it (ok, maybe I could ask nicely): that's one of the reasons I currently use cgi-bin executables, which are just fine on our server already, but I'm open to other possibilities.
My current plan is to write a (I mean adapt an existing) preprocessor to support a special syntax for defining functions of type
Html -> ... -> Html -> Html
that looks a lot like raw HTML with splice points. Then what I do with my existing raw HTML is indent it a bit and mark the holes.
But would that be a waste of time? Please, please tell me that this question is a duplicate!
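To make the "raw HTML with splice points" idea concrete, here is a minimal sketch of such a preprocessor. It is written in Python rather than Haskell purely for illustration, and the `{{name}}` hole syntax is an assumption, not the syntax of any existing tool. It turns an HTML fragment with named holes into a function from HTML strings to an HTML string, which is roughly the shape of the `Html -> ... -> Html -> Html` type above.

```python
import html
import re

# A hole looks like {{name}}; this syntax is invented for the sketch.
HOLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def compile_template(source):
    """Turn HTML with {{name}} holes into a function of keyword arguments."""
    # With one capture group, split() returns
    # [literal, name, literal, name, ..., literal].
    parts = HOLE.split(source)
    literals, names = parts[0::2], parts[1::2]

    def render(**values):
        out = [literals[0]]
        for name, literal in zip(names, literals[1:]):
            out.append(values[name])  # spliced in as-is: already HTML
            out.append(literal)
        return "".join(out)

    return render

def text(s):
    """Escape plain text so it can be spliced in as HTML."""
    return html.escape(s)
```

For example, `compile_template("<h1>{{title}}</h1><div>{{exercise}}</div>")` returns a function that takes `title` and `exercise` as HTML fragments; the existing notes stay untouched apart from the marked holes.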
There are Haskell frameworks like Yesod and Happstack which use templating engines like you describe.
Have you looked at the haskell wiki at http://www.haskell.org/haskellwiki/HSP or
http://www.haskell.org/haskellwiki/Web/Libraries/Templating ?
They may do what you need.
You might find something to do the job here: Templating packages for Haskell.
And you should probably look into Snap, Yesod or Happstack for serving the content.
I have a large pile of lecture notes in raw HTML format. I would like to add interactive content to these notes, in particular incorporating online exercises.
There is already a system (called "ActiveHs"), written in Haskell, that lets you put lecture notes and interactive exercises in one file.
See:
http://pnyf.inf.elte.hu/fp/UsersGuide_en.xml
http://pnyf.inf.elte.hu/fp/Constructive_en.xml
I can really say that it is very well written code and completely open source!

Extracting data from PDF or Word using PHP, Java

I need help on this...
Especially since I don't know where to start..
I am an IT undergraduate and, along with my groupmates, am now undergoing on-the-job training at a company.
SCENARIO:
The company asked us to create a program that will generate a report and store it in a database.
The database that will be used is MySQL.
As for what language to use, we are considering VB.Net, Java, PHP.
The program must be able to:
generate a report that will be sent by email to an office
store it in a database
collect all reports and collate them
generate a new report, which will then be sent to the main office
then store it in the main office's own database
For now, we are still trying to determine how the program will run and which language to use. It needs to be capable of reading and extracting data from a text file (either a Word document or a PDF file).
The company also wants the program to be online-ready for future expansion.
Now, our problem is:
Is there a way to extract data from a PDF or Word file using Java, PHP, or VB and then store it in the MySQL database?
If there is, can it be implemented without any third-party software?
We chose PDF or Word because the file should be printable for archive purposes.
Which programming language would make this easiest to achieve?
I would like to apologize if the info I am giving is a bit messed up. I will give additional information once we are able to talk with the company this week.
If there is a problem with the way I posted this, please forgive me. I am just trying my best to provide you with the information the best I could.
I'll answer for Java as it is what I use at work.
You can easily extract text from Word files or build a new Word file with Apache POI
As for PDF, iText and PDFBox both do a pretty nice job.
Why can't you use 3rd party software? If you could, I would recommend something like the approach in "How to read PDF files using Java?".
Or, to read a .doc file: http://www.roseindia.net/tutorial/java/poi/readDocFile.html
Anyway, if you can't use 3rd party tools, why not read the specifications and figure out how to extract the text from PDF, DOC, and DOCX files?
Here you can find DOC specifications: http://msdn.microsoft.com/en-us/library/cc313118.aspx
Here you can find the PDF format specification: http://www.adobe.com/devnet/pdf/pdf_reference.html
Good luck!
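As one small data point on the "no third-party software" question: a .docx file is a ZIP archive whose main text lives in `word/document.xml`, so its paragraphs can be read with a standard library alone. The sketch below uses Python for brevity; it does not handle the older binary .doc format or PDF, which are much harder to parse by hand, and it ignores tables, headers, and footers as distinct structures.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def docx_paragraphs(file):
    """Return the text of each paragraph in a .docx (path or file object)."""
    with zipfile.ZipFile(file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = [t.text or "" for t in p.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs
```

The returned list of strings could then be parsed for the report fields and inserted into MySQL with whatever database driver the chosen language provides.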

How can I create a well-formatted PDF?

I'm working on automating our company invoicing system. Currently all data is stored in our local MySQL database and someone manually updates an excel spreadsheet and then merges this data into a MS Word template. The goal is to automate this process so that the invoice can be generated from our intranet website as a PDF.
My original plan was to create a template in HTML/CSS and use wkhtmltopdf to generate the PDF but I ran into problems with getting a repeatable header and footer on each page. thead and tfoot aren't supported by Webkit and the fix suggested in this other question does not seem to work either.
So I then stumbled on using XML and XSL-FO, the latter of which I know nothing about. Is this the best path to take? Are there any libraries or utilities out there that will make converting my HTML+CSS into XML+XSL-FO easier? Are there any other alternatives I'm overlooking?
EDIT
Currently the server is CentOS Linux with a MySQL database. All other code is currently in PHP, but that may change as the whole system is being revamped. Linux and MySQL will almost certainly remain, though.
For your requirement, XSL-FO might just do the trick. It is much cleaner to produce the PDFs directly from the data than to go down the cumbersome HTML path. If you need to display the HTML as well, you might consider converting from HTML to PDF, but it will always be messy.
You can get XML results from MySQL quite easily (mysql --xml) and then write one (or several) XSL-FO stylesheets for the data. Then you can produce not only PDFs but also, with some processors, PostScript or RTF files.
XSL-FO has its limitations, though, but for your situation it should suffice.
I admit, the learning curve can be steep, and maintaining xslt-stylesheets can get very tiring, but as you start knowing more about it, you end up writing less code.
Another possibility is to do the whole thing in e.g. Java or C#: send SELECT statements, loop over the results, and iteratively build the PDF using a library like iText.
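To give a feel for the `mysql --xml` route mentioned above, here is a minimal sketch that reads that output and builds an XSL-FO document with a header repeated on every page (via `fo:static-content`). It builds the FO tree directly in Python instead of using an XSLT stylesheet, purely to keep the example self-contained. The column names `item` and `amount` and the page dimensions are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

FO = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO)

def fo(tag):
    """Qualify a tag name with the XSL-FO namespace."""
    return f"{{{FO}}}{tag}"

def rows_from_mysql_xml(xml_text):
    """Parse `mysql --xml` output into a list of {column: value} dicts."""
    root = ET.fromstring(xml_text)
    return [{f.get("name"): (f.text or "") for f in row.findall("field")}
            for row in root.findall("row")]

def invoice_fo(rows, header_text):
    """Build an XSL-FO document: repeating header plus one block per row."""
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "invoice",
        "page-height": "297mm",
        "page-width": "210mm",
    })
    # Leave room at the top of the body for the header region.
    ET.SubElement(page, fo("region-body"), {"margin-top": "20mm"})
    ET.SubElement(page, fo("region-before"), {"extent": "15mm"})

    seq = ET.SubElement(root, fo("page-sequence"), {"master-reference": "invoice"})
    # static-content is rendered again on every page of the sequence.
    header = ET.SubElement(seq, fo("static-content"),
                           {"flow-name": "xsl-region-before"})
    ET.SubElement(header, fo("block")).text = header_text

    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})
    for row in rows:
        # "item" and "amount" are hypothetical column names.
        ET.SubElement(flow, fo("block")).text = f"{row['item']}: {row['amount']}"
    return ET.tostring(root, encoding="unicode")
```

The resulting string can be saved to a file and handed to an FO processor such as Apache FOP (for example `fop -fo invoice.fo -pdf invoice.pdf`) to produce the PDF. A footer works the same way with `region-after` and `xsl-region-after`.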
You could try JODReports or Docmosis as less-code intensive options. You supply Word or OpenOffice Writer documents to act as templates and use these engines to manipulate/populate the templates then spit out the documents in the format(s) you require. This may mean your existing Word-templates can be used directly which should save you some effort/time.
iText is another library that will let you build and pump out PDFs from code. It's pretty good.
If you could use ASP.NET for the web side, you can use the free ReportViewer library and designer to automate publishing PDFs.
Here are some references:
http://gotreportviewer.com
http://weblogs.asp.net/srkirkland/archive/2007/10/29/exporting-a-sql-server-reporting-services-2005-report-directly-to-pdf-or-excel.aspx
If you're OK using .NET and C#, you could use DotPdf from Atalasoft (obligatory disclaimer: I work for Atalasoft and wrote most of DotPdf). The Generating namespace is geared for exactly what you're trying to do: automate report generation. From the very basics, you could just create docs directly with the toolkit or you can create template documents that have unpopulated text fields that you can reload and fill later (see here and here for examples).

Running XSLT on servers?

Currently I am working on a project for a client that compares the difference between two XML files, generates an XML that lists the differences (i.e. if a part in an inventory was <Added>, <Deleted>, or <Modified>) and displays a report in HTML.
I have three transforms that basically transform large vendor-specific XML files to simple generic XML files (schema defined). These generic XML files are then transformed into one generic XML file that shows the differences and then that is transformed to a report.html for display for the user.
Presently for testing, I invoke a .bat file to run all three transforms (using Saxon8.jar). My question is, is it possible to put these transforms on a server and create a HTML page with a one-click action that will let the user upload the vendor-specific XML files, transform them, and display the generated HTML file to the user?
You haven't specified whether you'll be using PHP, Java or ASP.NET; however, the functionality you're looking for is possible in all three cases. Your backend web app should have the necessary mechanism to accept the file uploaded by the user, save it in some work folder, run the necessary transformation using your chosen language (Java, C#, PHP, etc.), and then write back the HTML.
Is it possible? Yes.
To do it you'd typically use some server-side technology (php, ruby, java) to perform the transforms.
But browser-side XSLT is possible, too.
Apache Cocoon is a powerful XML processing engine.
If you're just doing this one job, then coding a Java servlet to do it is not too difficult. If you're doing lots of similar things, a framework like Cocoon or Orbeon will save you effort in the long run.
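As a sketch of what the server-side code might do once the uploaded files are saved to a work folder, the example below chains the transforms by invoking Saxon from the command line, much as the .bat file does. It is written in Python for brevity; a servlet or PHP script would follow the same steps. The stylesheet names (`vendor_to_generic.xsl`, `compare.xsl`, `report.xsl`) and the `other` stylesheet parameter used to pass the second file to the comparison are assumptions, since the question does not say how the real transforms are wired together.

```python
import subprocess
from pathlib import Path

SAXON_JAR = "saxon8.jar"  # assumption: the same jar the .bat file uses

def saxon_command(source, stylesheet, output, **params):
    """Build the argv for one Saxon 8 command-line transform."""
    cmd = ["java", "-jar", SAXON_JAR, "-o", str(output),
           str(source), str(stylesheet)]
    # Stylesheet parameters are passed as trailing name=value arguments.
    cmd += [f"{name}={value}" for name, value in params.items()]
    return cmd

def build_report(old_vendor_xml, new_vendor_xml, work_dir,
                 run=subprocess.check_call):
    """Run vendor -> generic -> diff -> report and return the report path."""
    work = Path(work_dir)
    old_generic = work / "old_generic.xml"
    new_generic = work / "new_generic.xml"
    diff = work / "diff.xml"
    report = work / "report.html"

    # Stylesheet file names below are hypothetical.
    run(saxon_command(old_vendor_xml, "vendor_to_generic.xsl", old_generic))
    run(saxon_command(new_vendor_xml, "vendor_to_generic.xsl", new_generic))
    # Hypothetical: compare.xsl reads the second file via an "other" parameter.
    run(saxon_command(old_generic, "compare.xsl", diff, other=new_generic))
    run(saxon_command(diff, "report.xsl", report))
    return report
```

The upload handler would call `build_report(...)` with the two saved files and send the contents of the returned `report.html` back to the browser. Using a fresh work folder per request keeps concurrent uploads from overwriting each other.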

.NET template files

I have an application that generates a couple of different mails. These mails are currently built up using a string builder that generates an HTML-based string that is the mail content.
This approach is getting messy: the code is objects mixed with HTML, and so on. What I'd like is to have a template similar to the ones used in, for example, MVC, and then use the output of this template as the mail content.
Can this be done using, for example, T4, and how would that work? Or should I use another approach for this?
The templates don't have to be editable at runtime even if that would be nice.
It can be done with T4. It would work exactly the way it does in MVC.
You should consider getting the Clarius T4 Editor, as Visual Studio does not come with intellisense for T4 out of the box, and the Clarius T4 editor can provide that.
Have you considered just using XSLT? You can easily create HTML documents from raw XML by applying an XSL transform in .NET.
We store XSL templates in the database, and use a simple NAME/VALUE pair approach to populate the templates in code. It is very effective and flexible, and the output can be piped to the browser easily. Alternatively you could just store the individual templates in files on the server and load them up that way.
Either way, the amount of code needed to implement this is relatively small, and .NET has many classes in place to facilitate this. If you want an example of how we implemented this, leave a comment and I will append some code examples to this answer.
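For readers who just want to see the name/value idea in miniature, here is a sketch that keeps the mail HTML in a template and fills it from a dictionary. It uses Python's `string.Template` as a stand-in for T4 or XSLT, so it is not how either of those tools works internally; the template text and the field names (`customer`, `order_id`, `ship_date`) are made up for the example.

```python
import html
from string import Template

# The template could equally be loaded from a file or a database row.
ORDER_MAIL = Template("""\
<html>
  <body>
    <p>Hello $customer,</p>
    <p>Your order <b>$order_id</b> shipped on $ship_date.</p>
  </body>
</html>
""")

def render_mail(template, values):
    """Fill the template from name/value pairs, HTML-escaping every value."""
    safe = {name: html.escape(str(value)) for name, value in values.items()}
    # substitute() raises KeyError if the template needs a name we lack.
    return template.substitute(safe)
```

The point of the design is that the HTML lives in one place and the code only supplies values, so changing the wording of a mail does not mean touching string-building code.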