(OpenXML) Add data pages to xml package without framework - language-agnostic

Lately I've been combining multiple OpenXML spreadsheets via PHPExcel, which
showed me that this framework has certain issues that make it pretty much unusable
for what I want to do (my related SO question).
To make it short: it's hard to guarantee that all formatting features of Excel 2007 will
survive a file merge performed with that particular framework.
Anyway, now I'm thinking of a more general approach. I want to open a template XLSX
that contains various formatting and add some plain alphanumeric data worksheets 'at the end' of the workbook.
Is it sensibly possible to do the following:
unzip template XLSX
parse XML files
add worksheets
save xml files
rezip files to get a valid XLSX
Any hints or experiences would be highly appreciated.
thanks in advance
K

I haven't worked with .xlsx too much, but I've altered .docx files by manually adding and editing the XML.
The biggest concern with adding new parts to a document is to make sure you update the .rels files. The best way to figure out what needs to be updated is to create a new .xlsx document in Excel, add a worksheet, save the file and then unzip it to see what has changed. You can also use the DocumentReflector tool that comes with the OpenXML SDK if you want to see the internals of the file without having to unzip it.
I found the OpenXML reference manual very helpful when hand editing files because it tells you what elements you have to keep and what elements are optional to make a valid document. It makes it easier to work with when you can remove some of the extraneous elements that Excel adds automatically.
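To make the unzip / edit / rezip route a bit more concrete, here is a minimal Python sketch of appending one plain data worksheet to an existing package. It is only an illustration under several assumptions: the names `add_sheet`, `sheet_xml`, `col_letter` and `next_id` are made up for this example; the workbook part is assumed to be UTF-8, to use unprefixed element names and to declare the `r` prefix (as Excel-written files normally do); and all values are written as inline strings so that sharedStrings.xml does not have to be touched. The three existing parts that get updated are xl/workbook.xml, xl/_rels/workbook.xml.rels and [Content_Types].xml. The XML is patched as text rather than re-serialized, so everything else in the template is copied through unchanged.

```python
import io
import re
import zipfile
from xml.sax.saxutils import escape, quoteattr

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
WS_REL = ("http://schemas.openxmlformats.org/officeDocument/2006/"
          "relationships/worksheet")
WS_TYPE = ("application/vnd.openxmlformats-officedocument."
           "spreadsheetml.worksheet+xml")


def col_letter(i):
    """Zero-based column index to letters: 0 -> A, 25 -> Z, 26 -> AA."""
    s, i = "", i + 1
    while i:
        i, rem = divmod(i - 1, 26)
        s = chr(65 + rem) + s
    return s


def sheet_xml(rows):
    """Build a worksheet part; every value is written as an inline string."""
    body = []
    for r, row in enumerate(rows, start=1):
        cells = "".join(
            '<c r="%s%d" t="inlineStr"><is><t>%s</t></is></c>'
            % (col_letter(c), r, escape(str(v)))
            for c, v in enumerate(row)
        )
        body.append('<row r="%d">%s</row>' % (r, cells))
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<worksheet xmlns="%s"><sheetData>%s</sheetData></worksheet>'
        % (MAIN_NS, "".join(body))
    )


def next_id(pattern, text):
    """Return 1 + the largest number captured by pattern in text."""
    return max((int(n) for n in re.findall(pattern, text)), default=0) + 1


def add_sheet(xlsx_bytes, name, rows):
    """Return a copy of the package with one extra worksheet at the end."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as src:
        parts = {n: src.read(n) for n in src.namelist()}
    wb = parts["xl/workbook.xml"].decode("utf-8")
    rels = parts["xl/_rels/workbook.xml.rels"].decode("utf-8")
    types = parts["[Content_Types].xml"].decode("utf-8")

    # Pick a file number, relationship id and sheetId that are not in use.
    num = next_id(r"xl/worksheets/sheet(\d+)\.xml", "\n".join(parts))
    rid = "rId%d" % next_id(r'Id="rId(\d+)"', rels)
    sheet_id = next_id(r'sheetId="(\d+)"', wb)
    target = "worksheets/sheet%d.xml" % num

    # 1. List the sheet in the workbook, after the existing sheets.
    wb = wb.replace(
        "</sheets>",
        '<sheet name=%s sheetId="%d" r:id="%s"/></sheets>'
        % (quoteattr(name), sheet_id, rid), 1)
    # 2. Point the relationship id at the new part.
    rels = rels.replace(
        "</Relationships>",
        '<Relationship Id="%s" Type="%s" Target="%s"/></Relationships>'
        % (rid, WS_REL, target), 1)
    # 3. Declare the content type of the new part.
    types = types.replace(
        "</Types>",
        '<Override PartName="/xl/%s" ContentType="%s"/></Types>'
        % (target, WS_TYPE), 1)

    parts["xl/workbook.xml"] = wb.encode("utf-8")
    parts["xl/_rels/workbook.xml.rels"] = rels.encode("utf-8")
    parts["[Content_Types].xml"] = types.encode("utf-8")
    parts["xl/" + target] = sheet_xml(rows).encode("utf-8")

    # Rezip: unchanged parts are written back byte for byte.
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for part_name, data in parts.items():
            dst.writestr(part_name, data)
    return out.getvalue()
```

Usage would be along the lines of reading the template file as bytes, calling `add_sheet(data, "Data", rows)` and writing the result to a new .xlsx file. The sketch does not update docProps/app.xml or validate the sheet name, so comparing its output against a file saved by Excel, as described above, is still the best check.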

Related

Creating a CSV file with the Report Generation Toolkit in Labview

I want to create .csv files with the Report Generation Toolkit in Labview.
They must actually be .csv files which can be opened with Notepad or something similar.
Creating a file named .csv is not that hard; it's just a matter of adding the extension to the name of the file that's going to be created.
If I create a .csv file this way it opens nicely in Excel, just the way it should, but if I open it in Notepad it shows all kinds of strange characters that don't even come close to the data I wrote to the file.
I create the files with the Labview code below:
Link to image (can't post the image yet because I've got too few points)
I know .csv files can be created with the Write to Spreadsheet VI but I would like to use the Report Generation Toolkit because it's pretty easy to add columns and rows to the file and that is something I really need.
You can use the Robust CSV package from the lavag.org forum to read and write 2D arrays to CSV files.
http://lavag.org/files/file/239-robust-csv/
Calling a file "csv" does not make it a CSV file. I never used the toolkit to generate an Excel file, but I'm assuming it creates an XLS or XLSX file, regardless of what extension you give it, which is why you're seeing gibberish (probably XLS, since it's been around for a while and I believe XLSX is XML, not binary).
I'm not sure what your problem is with the write spreadsheet VI. It has an append input, so I assume you can use that to at least add rows directly to a file, although I can't say I ever tried it. I would prefer handling all the data in memory explicitly, where you can easily use the array functions to add rows or columns to the array and then overwrite the entire file.
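A quick way to confirm what the toolkit really wrote, whatever the extension says, is to look at the first bytes of the file. The Python sketch below is only an illustration (the function name `sniff_format` is made up for this example); it relies on two well-known signatures: the OLE2 compound-file header used by legacy .xls files and the zip local-file header used by .xlsx packages. A real CSV file has neither and is simply readable text.

```python
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # compound file, e.g. legacy .xls
ZIP_MAGIC = b"PK\x03\x04"                          # zip archive, e.g. .xlsx


def sniff_format(first_bytes):
    """Classify a file by its leading bytes, ignoring its extension."""
    if first_bytes.startswith(OLE2_MAGIC):
        return "ole2"
    if first_bytes.startswith(ZIP_MAGIC):
        return "zip"
    return "other"  # possibly plain text such as a real CSV


def sniff_file(path):
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))
```

If `sniff_file` reports "ole2" or "zip" for the file the toolkit produced, the gibberish in Notepad is explained: the file is a workbook with a .csv name, not comma-separated text.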

How do I download contents of an html table generated by play 1.2.7 backend on java in xls

I've generated a table using Play's #{list} tag and get pretty decent results. Now I need to be able to generate and download an xls version of the table and have no idea what to do. Any pointers at all will be much appreciated.
Well you have various options.
Excel will open HTML files. So you can keep rendering your table as HTML, but stream it to the browser with the content type set to Excel's (application/vnd.ms-excel).
While Excel will open this, it will still be an HTML file rather than an XLS(X) document.
You can generate CSV from your data model and stream this to the browser. Again, this will be a CSV rather than a proper XLS(X) document.
There also seem to be some solutions around which can do it using JavaScript. See as a starting point: Generate excel sheet from html tables using jquery
Finally, you can use something like Apache POI or JXLS to generate a 'proper' xls(x) document and stream this to the browser. I have some code here that will export HTML to a 'proper' xlsx file if this is the route you wish to go. The workflow is then to create some HTML from your data model and convert that to Excel, rather than having to build the Excel document programmatically using POI. https://github.com/alanhay/html-exporter
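For the CSV option, the moving parts are small: serialize the rows behind the table with proper quoting, and send the text with headers that make the browser offer it as a download. The sketch below shows this in Python purely as an illustration; the names `table_to_csv` and `download_headers` are made up for this example, and in a Play 1.x application the same idea would be written in the Java controller using whatever response API the framework provides.

```python
import csv
import io


def table_to_csv(header, rows):
    """Serialize a header and data rows to CSV text with standard quoting."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def download_headers(filename):
    """HTTP headers that ask the browser to save the body as a file."""
    return {
        "Content-Type": "text/csv; charset=utf-8",
        "Content-Disposition": 'attachment; filename="%s"' % filename,
    }
```

Values that contain commas or quotes are quoted by the `csv` module, which is the part that is easy to get wrong when concatenating strings by hand.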

Extract excel metadata in Linux

I have used the "extract" command, but it was never able to find as much information as FOCA found in the Excel spreadsheets I am dealing with.
For example, I am using the FOCA application to harvest and download files from the web. Afterwards, it extracts metadata from all of the files.
With regard to Excel files, it appears that these files contain more metadata than the average PDF file. That being said, FOCA is able to detect printer names, email addresses, and a few other things that are stored within the spreadsheet file. However, I cannot find any way to get this same information in Linux using the "extract" command.
Does anyone know a way to process files in Linux and grab ALL of their metadata? It seems the extract command may be limited, from what I understand.
Thanks,
Excel files store a lot of metadata within the file, so you would have to parse the file itself to get at it. Since you're on Linux and can't use the Excel interop, you could try an Excel library like ExcelWriter or something similar. ExcelWriter is written for .NET, so you'd have to use Mono.
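If the files are .xlsx rather than legacy .xls, part of that parsing can be done with nothing but a zip and XML reader, because the document properties live in the docProps/core.xml and docProps/app.xml parts of the package. The Python sketch below is a minimal illustration under that assumption (the function name `doc_properties` is made up for this example). It only collects simple text properties such as creator, lastModifiedBy or Application; it does not recover things like printer names, and binary .xls files would need an OLE2 parser instead.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def doc_properties(xlsx_bytes):
    """Collect the simple text properties stored in an .xlsx package."""
    props = {}
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as z:
        names = z.namelist()
        for part in ("docProps/core.xml", "docProps/app.xml"):
            if part not in names:
                continue  # both parts are optional
            for child in ET.fromstring(z.read(part)):
                text = (child.text or "").strip()
                if text:
                    # Drop the "{namespace}" prefix ElementTree adds to tags.
                    props[child.tag.split("}")[-1]] = text
    return props
```

Listing `namelist()` on the same archive shows which other parts the file contains, which is a reasonable starting point for hunting down the remaining metadata.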

as3xls > Number of sheets always 0

I'm trying to parse a .xlsx file exported from Google Docs. Right now I'm not trying to access it online; I'm manually downloading it and copying it into my application.
I've read the tutorial provided online, and this is the code I have right now:
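import flash.filesystem.File;
import flash.filesystem.FileMode;
import flash.filesystem.FileStream;
import flash.utils.ByteArray;
import com.as3xls.xls.ExcelFile;
// Read the whole bundled file into a ByteArray.
var contentBA:ByteArray = new ByteArray();
var fileToLoad:File = File.applicationDirectory.resolvePath("textLabels.xlsx");
var stream:FileStream = new FileStream();
stream.open(fileToLoad, FileMode.READ);
stream.readBytes(contentBA, 0, stream.bytesAvailable);
stream.close();
// Hand the bytes to as3xls and report how many sheets it found.
var xls:ExcelFile = new ExcelFile();
xls.loadFromByteArray(contentBA);
trace("N SHEETS ", xls.sheets.length);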
but the number of sheets is always 0. I tried changing the file and loading the simplest Excel file possible, but it keeps saying 0.
Is it a problem with the ".xlsx" extension? Am I missing something?
AS3XLS was written for the old BIFF file format (Office 97 style documents). I've written an XLSX exporter for my work on the AdvancedDataGrid, but it's proprietary work so I can't share the code, unfortunately. However, I can give you some direction.
The BIFF format used special codes for encoding things like cell formatting or formulas; the binary format was seemingly meant to reduce the file size (and perhaps to serve as a form of obfuscation). Creating a BIFF file was complicated, and the format was reverse engineered by the Open Office team before Microsoft ever published the spec for it. XLSX instead takes the more open XML approach, and the newer XML formats are pretty well documented. Every new Office file whose extension ends in x is an archive (just like a zip file; you can open it with any archive tool) with a bunch of XML files inside that define the sheet.
I basically took a sample sheet with nothing in it (opened Excel, saved a new workbook), pieced it apart, and wrote AS3 classes that corresponded to each of the XML files. Each class implemented an interface that required a getXMLString() method. Then I wrote a wrapper that creates all the objects, uses the container pattern/traversal to build all the XML files needed, and packages them together with the nochump AS3 zip library.
A useful tool for inspecting the xlsx or docx or whateverx files can be found here:
http://www.microsoft.com/en-us/download/details.aspx?id=5124
If you're on a Mac, there's a discussion of one here:
http://openxmldeveloper.org/discussions/development_tools/f/27/p/1494/7453.aspx
In general, that site was helpful:
http://openxmldeveloper.org/
Documentation showing (minimal) examples
http://www.ecma-international.org/news/TC45_current_work/Office%20Open%20XML%20Part%203%20-%20Primer.pdf
NoChump's AS3 Zip library
http://nochump.com/blog/archives/15
Basically, for more advanced features like tables or cell spanning, I made the change I wanted to support by hand in a simple Excel file, compared that file against one without the new feature using the first tool linked above, and then implemented the change in the appropriate AS3 class (the one that corresponds to the XML file that changed).
It took about 2 weeks to get the organization of the classes solid but it's absolutely achievable.
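The same "one builder per XML part, then zip them together" structure can be shown in a few lines. The sketch below is in Python rather than AS3 and is not the proprietary exporter described above; the function names (`content_types`, `relationships`, `workbook`, `worksheet`, `build_xlsx`) are made up for this example. It produces the five parts a minimal single-sheet workbook needs, writes every value as an inline string, and assumes at most 26 columns to keep the cell references simple.

```python
import io
import zipfile
from xml.sax.saxutils import escape, quoteattr

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
SML = "application/vnd.openxmlformats-officedocument.spreadsheetml."
REL_CT = "application/vnd.openxmlformats-package.relationships+xml"


def content_types():
    """[Content_Types].xml: declares the type of every part in the package."""
    return (XML_DECL + '<Types xmlns="%s">' % CT_NS
            + '<Default Extension="rels" ContentType="%s"/>' % REL_CT
            + '<Default Extension="xml" ContentType="application/xml"/>'
            + '<Override PartName="/xl/workbook.xml" '
              'ContentType="%ssheet.main+xml"/>' % SML
            + '<Override PartName="/xl/worksheets/sheet1.xml" '
              'ContentType="%sworksheet+xml"/>' % SML
            + '</Types>')


def relationships(kind, target):
    """A .rels part holding a single relationship with id rId1."""
    return (XML_DECL + '<Relationships xmlns="%s">' % PKG_REL
            + '<Relationship Id="rId1" Type="%s/%s" Target="%s"/>'
              % (DOC_REL, kind, target)
            + '</Relationships>')


def workbook(sheet_name):
    """xl/workbook.xml: lists the sheets and links them by relationship id."""
    return (XML_DECL
            + '<workbook xmlns="%s" xmlns:r="%s">' % (MAIN_NS, DOC_REL)
            + '<sheets><sheet name=%s sheetId="1" r:id="rId1"/></sheets>'
              % quoteattr(sheet_name)
            + '</workbook>')


def worksheet(rows):
    """xl/worksheets/sheet1.xml: cell data as inline strings (max 26 columns)."""
    body = "".join(
        '<row r="%d">%s</row>' % (r, "".join(
            '<c r="%s%d" t="inlineStr"><is><t>%s</t></is></c>'
            % (chr(65 + c), r, escape(str(v)))
            for c, v in enumerate(row)))
        for r, row in enumerate(rows, start=1))
    return (XML_DECL + '<worksheet xmlns="%s"><sheetData>%s</sheetData>'
            '</worksheet>' % (MAIN_NS, body))


def build_xlsx(rows, sheet_name="Sheet1"):
    """Wrapper: build every part, then package them as a zip archive."""
    parts = {
        "[Content_Types].xml": content_types(),
        "_rels/.rels": relationships("officeDocument", "xl/workbook.xml"),
        "xl/workbook.xml": workbook(sheet_name),
        "xl/_rels/workbook.xml.rels":
            relationships("worksheet", "worksheets/sheet1.xml"),
        "xl/worksheets/sheet1.xml": worksheet(rows),
    }
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as z:
        for name, xml in parts.items():
            z.writestr(name, xml.encode("utf-8"))
    return out.getvalue()
```

Writing the bytes returned by `build_xlsx([["label", "text"], ["greeting", "hello"]])` to a file with an .xlsx extension gives a package that can be unzipped and compared part by part against a workbook saved by Excel. Styles, shared strings and number cells would each be a further builder in the same pattern.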
ExcelFile is a class from a custom library, and it says that it supports Excel 2002-2003. What version of Excel do you use?

Convert excel spreadsheet into html using vb.net?

I would like to see some code/tutorials on reading an Excel spreadsheet (a calendar) into VB.NET. I'm pretty much okay from there. I want to convert it to an HTML table and output it into an HTML file for inclusion on a website.
Where can I find tutorials, OR can someone post some code with a description to get me started?
BONUS:
Is there a better way to include an xls file in a webpage?
There is an Excel reader available on CodeProject here. The article covers the Excel file format layout and the related material from Sun on how OpenOffice reads Excel spreadsheets.
AFAIK, there is no way to include an xls in a webpage, unless you are talking about showing the actual data. It may be possible, under IE only, to trigger the end user's Excel application, but IMHO that would be a dangerous assumption, as it requires the user to have MS Office installed.
Hope this helps,
Best regards,
Tom.